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In the picture of eternal inflation, our observable universe resides inside a single bubble nucleated 
from an inflating false vacuum. Many of the theories giving rise to eternal inflation predict that we 
have causal access to collisions with other bubble universes, providing an opportunity to confront 
these theories with observation. We present the results from the first observational search for the 
effects of bubble collisions, using cosmic microwave background data from the WMAP satellite. 
Our search targets a generic set of properties associated with a bubble collision spacetime, which we 
describe in detail. We use a modular algorithm that is designed to avoid a posteriori selection effects, 
automatically picking out the most promising signals, performing a search for causal boundaries, 
and conducting a full Bayesian parameter estimation and model selection analysis. We outline each 
component of this algorithm, describing its response to simulated CMB skies with and without 
bubble collisions. Comparing the results for simulated bubble collisions to the results from an 
analysis of the WMAP 7-year data, we rule out bubble collisions over a range of parameter space. 
Our model selection results based on WMAP 7-year data do not warrant augmenting ACDM with 
bubble collisions. Data from the Planck satellite can be used to more definitively test the bubble 
collision hypothesis. 
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I. INTRODUCTION 

Observations of the cosmic microwave background (CMB) radiation have ushered in a new era of precision cosmology. 
Fuh-sky temperature maps produced by the Wilkinson Microwave Anisotropy Probe (WMAP) [1 have confirmed with 
high precision that the observed temperature fluctuations are consistent with a nearly Gaussian and scale invariant 
primordial power spectrum, as predicted by inflation. The recently launched Planck satellite [2 has a resolution three 
times better than that of WMAP, with an order of magnitude greater sensitivity, and significantly wider frequency 
coverage (allowing for far more robust foreground removal, and therefore reduced systematics). These high quality 
data sets allow for the possibility of observing deviations from the standard inflationary paradigm, some of which 
could have drastic consequences for our understanding of the universe and its origins. 

Perhaps the largest gap in our description of the early universe lies in an understanding of its initial conditions. 
One possibility, motivated by the proliferation of vacua in compactifications of string theory (known as the string 
theory landscape [3 ), is that our observable universe is only a tiny piece of a vast multiverse, the majority of which 
is still inflating. This picture of eternal inflation (for a review, see, e.g., Ref. [4 ) arises when the rate at which local 
regions exit an inflating phase is outpaced by the accelerated expansion of the inflating background. Eternal inflation 
is a fairly generic consequence of any theory containing positive vacuum energy and multiple vacua, highlighting the 
importance of understanding how this scenario might be confronted with observational tests. 

The first attempts to embed our cosmology inside an eternally inflating universe led to "open inflation" O [6] ; see 
Ref. [7 for a review. In this scenario, a scalar field (or set of scalar fields) has a potential with a high energy metastable 
minimum that drives the eternally inflating phase. Transitions out of this vacuum proceed via the Coleman-de Luccia 
(CDL) instanton [8l|9], resulting in expanding bubbles inside which the scalar field rests on an inflationary plateau. 
The symmetries of the CDL instanton ensure that there is a very nearly homogeneous and isotropic open universe 
inside the bubble; inflation, reheating, and standard cosmological evolution follow. 

In any given bubble, the future light cone of the nucleation event forms the "Big Bang" (where the scale factor 
vanishes) of an open FRW universe. The eternally inflating phase outside our bubble can therefore be thought of as 
a pre-Big Bang epoch, and one might expect inflation to erase any of the scant observational evidence of our parent 
vacuum. In single bubble open inflation, various anomalies are induced in the CMB temperature power spectrum (see 
Ref. [7] and references therein), but unfortunately, the size of these effects decreases with the present energy density 
in curvature (related to the number of inflationary e-folds), rendering them negligible at all but the lowest multipoles 
where cosmic variance dominates. However, our bubble does not evolve in isolation. There are other nucleation events 
from the false vacuum, containing a phase that might be identical to ours, or perhaps very different. If one of these 
secondary nucleation events occurs close enough to our bubble wall, then a collision inevitably results. In fact, since 
our bubble grows to reach infinite size, there are an infinite number of collisions [10 -13 (a finite subset of which are 
causally accessible to any one observer) . This raises the possibility that if such collisions are both survivable and only 
small perturbations on top of standard cosmology, they might leave observable signatures of eternal inflation [14 ; it 
is these signatures which our analysis targets. 

If we are to detect such bubble collisions, their predicted signatures must be consistent with our observed cosmology, 
but sufficiently distinct to be differentiated from other possible signals in the CMB. In addition, the theory must predict 
that we expect to have causal access to bubble collisions. While these criteria are not met in every model of eternal 
inflation, recent work p!QH26] (for a review, see Ref. [27]) has established that bubble collisions could in some theories 
be both expected and detectable. Bubble collisions produce a fairly characteristic set of inhomogeneities in the very 
early universe, which are processed into temperature anisotropics in the CMB. From the spherical symmetry of the 
colliding bubbles, the collision possess azimuthal symmetry, and by causality must be confined to a disc on the sky. 
The CMB temperature and its derivatives need not be constant across the causal boundary. Therefore, the signals 
we are searching for are localized, and because they are primordial, consist of a long-wavelength modulation of the 
standard inflationary density fluctuations inside the affected region [20 . The amplitude and angular scale of the 
signal is dependent upon the underlying model and kinematics of the collision. 

These general features suggest a set of strategies for data analysis. The localization of the collision implies that 
wavelet analysis could be a sensitive tool for picking out both the location and angular scale of a candidate signal. 
The causal boundary, across which the temperature and its derivatives need not be constant, suggests the use of 
edge detection algorithms similar to those used in searches for cosmic strings [28ti3Tj . Finally, the prediction that the 
temperature modulation induced by the collision is rather long-wavelength yields a sufficiently generic template to 
perform a full Bayesian parameter estimation and model selection analysis. 

In this paper, we describe a modular analysis algorithm designed to look for the signatures of eternal inflation, 
and apply it to the WMAP 7-year data [32]. This algorithm can easily be adapted to test any model that predicts 
a population of spatially localized sources in addition to the standard fluctuations predicted by ACDM. A summary 
of our results was presented in Ref. [33 ; in this paper we describe our analysis in detail. Currently available full sky 
CMB data are rather limited in their sensitivity to the signatures of bubble collisions listed above; the main current 
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limitation is the low resolution. Therefore, we apply our algorithm to current data mainly as a validation exercise; to 
exploit its full power would require future high resolution data, e.g., from Planck. 

The individual steps of our analysis pipeline are calibrated using realistic simulations of the WMAP experiment with 
and without bubble collisions. The calibrated pipeline applied to data is fully automated, identifying the candidate 
signals and processing them without any human intervention. This removes any a posteriori choices from our analysis, 
which must be carefully avoided in any analysis of a large data-set such as the WMAP 7- year data [34 . 

The plan of the paper is as follows. In Sec. [TTj we review some of the background on bubble collisions in eternal 
inflation, and outline the predicted observable signatures. Our analysis pipeline is summarized in Sec.|III[ We describe 
some properties of the WMAP experiment and our simulations in Sec. |IV[ and detail our analysis tools in Sec. |V[ 



Sec. VI summarizes the results of our analysis of the WMAP 7-year data, and we conclude in Sec. VII 



II. THE OBSERVABLE EFFECTS OF BUBBLE COLLISIONS 



The simplest model of eternal inflation involves a single scalar field in four dimensions, with a double- well potential. 
In many models (as long as the average curvature of the potential between the minima is small compared to the 
Planck scale), the Coleman-de Luccia (CDL) instanton [319] mediates a transition from the false (higher energy) to 
the true (lower energy) vacuum. This tunneling event corresponds to the appearance of an expanding bubble of the 
true vacuum embedded in the false. As long as the probability that a bubble nucleates in each horizon volume of the 
false vacuum during a Hubble time is less than one (so that the background expansion of the false vacuum on average 
prevents bubbles from merging), the phase transition never completes and inflation is eternal [10 (see Ref. \35^ for a 
modern treatment of the percolation problem in eternal inflation). The 0(4)-invariance of the instanton guarantees 
that the bubble interior possesses SO (3,1) symmetry, and therefore contains an infinite open Friedman Robertson 
Walker (FRW) universe. Although homogeneity is ensured by the symmetries of the instanton, if the interior of a 
bubble is to resemble our own universe, a second epoch of inflation inside the bubble is necessary to dilute the negative 
curvature and provide the correct spectrum of primordial density perturbations to seed structure. Models of this type 
are known in the literature as open inflation, and have been explored in detail (see Ref. [7 ). 

The signatures of single-bubble open inflation include negative curvature and modifications to the power spectrum. 
These modifications are most important at large angular scales (see Ref. [7, and references therein) where cosmic 
variance is dominant, and would be very difficult to detect. Since curvature alone would not be a very distinguishing 
prediction, we do not consider these signals further. 

A less ambiguous signature of eternal inflation would be the visible remnants of collisions between bubbles. Although 
the bubbles formed during eternal inflation do not percolate, there are many (in fact, an infinite number of) collisions. 
These collisions lead to inhomogeneities in the inner-bubble cosmology, perhaps leaving observable signatures in the 
CMB. Assessing the observational consequences of bubble collisions in an eternally inflating universe has been an 
active area of research [13[ 114^ [T7H26] (for a review, see Ref. [27^). These studies have established that a number of 
criteria necessary for the observation of bubble collisions [14] can be satisfied, at least in some models: 

Compatibility: In order to satisfy this criterion, there must be a bubble we can collide with that only minimally 
disturbs the homogeneity of the observable portion of the surface of last scattering. Such collisions do seem to exist, 
as evidenced by thin- wall junction condition analysis [TTllin] as well as numerical simulations [18] and a study of the 
infiaton field in a background thin- wall collision geometry [20^. 

Probability: We should expect to have a collision in our causal past. The number of collisions in our past is 
N = XV f , where A is the bubble nucleation probability per unit four- volume and Vf^ is the four- volume outside the 
bubble to which we have causal access. The expected number (assuming the original open FRW foliation) is formally 
infinite [14]; however, collisions that contribute to this divergence only produce very long wavelength fluctuations at 
last scattering, and so would not be observable [TS", ^22^ (this is similar to the infrared divergence found in models of 
slow-roll inflation). The average number of collisions that affect the observable portion of the surface of last scattering 
is finite |22j [27] , and is given by 

where Hp is the false vacuum Hubble constant, Hj is the inflationary Hubble constant inside our bubble, and is 
the current component of energy in curvature. For the expected number of observable collisions to be one or larger, 
the separation of scales between Hp and Hj must be large enough to compensate for the low probability A (which is 
exponentially suppressed because this is a tunneling process) and the observational constraint ^ .0084 [36]. Given 
a particular scalar potential underlying eternal inflation, N for each possible type of collision is fixed. However, in 
a theory with a complicated potential landscape for the scalar field(s), it makes sense to think of iV as a continuous 
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parameter with some prior probability distributiorQ Without detailed knowledge of the theory underlying eternal 
inflation and an associated measure, it is difficult to assess how likely it is to have TV > 1, but see Refs. [22l [27] 
for some speculative comments. There is also an exponential pressure from the nucleation rates towards iV ^ 1 or 
TV <C 1. In the following, we assume N can be order one. 

Observability: Since the effects of a collision must pass through the entire inner-bubble cosmology, they can be 
thought of as perturbations of the Big Bang in an FRW cosmology. As such, they are stretched by inflation, and 
we expect the strength of most signatures to scale with (some power of) Qk- We therefore must require that there 
are not too many more e- folds than required to satisfy the observational bound on curvature. Given a field theory 
model, the number of e-folds of inflation inside the bubble is uniquely determined by the instanton. However, if we 
consider a landscape of scalar potentials, then it is necessary to find a measure over the number of inflationary e-folds 
(or equivalently Qk)- For some work in this direction, see, e.g., Refs j37l [38;. 

Much remains to be learned about the full spectrum of possible outcomes of a bubble collision and the exact details 
of the associated observational signatures. Nevertheless, all potentially observable bubble collisions involving two 
bubbles share a sufficiently general set of properties to allow for a meaningful observational search even in the absence 
of a detailed model. In summary, we expect all such observable bubble collisions to possess: 



Azimuthal symmetry: A collision leaves an imprint on the CMB sky that has azimuthal symmetry. This is a 
consequence of the S0(2,l) symmetry of the spacetime describing the collision of two vacuum bubbles p^[T4l [T6]. 



• A causal boundary: The surface of last scattering can only be affected inside the future light cone of a collision 
event. The intersection of our past light cone, the future light cone of a collision, and the surface of last scattering 
is a ring. This is the causal boundary of the collision on the CMB sky. The temperature and its derivatives 
need not be continuous across this boundary. Neglecting the backreaction of the collision on the geometry of 
the bubble interior, the distribution of ring sizes was found in Ref. [22] to be 



^^^^^^ 47rAff^^(^-^J n;^'^sm{e„n), (2) 

where 6>crit is the angular radius measured from the center of the disc to the causal boundary and the other 
quantities are as defined in Eq. [ij 

An overall modulation of the background fluctuations: We assume that the temperature fluctuations, including 
the effects of the collision, at a location on the sky n can be written as [201 132 

^ = (l + /(n))(l + 5(n))-l, (3) 

where /(n) is the modulation induced by the collision and ^(n) are the temperature fluctuations induced by 
modes set down during inflation. This is motivated by the observation that the main effect of the bubble 
collision is to slightly advance or retard the inflaton inside our bubble. The modulation is multiplicative under 
the assumption that the normal inflationary density fluctuations simply "paint" the perturbed surface of last 
scattering and have identical statistical properties in both the regions affected and unaffected by the collision. 

Long-wavelength modulation: A collision is a pre-inflationary relic. The effects of a collision inside the causal 
boundary are stretched by inflation, and so we can expect that the relevant fluctuations are large-scale. As 
we describe below, this implies that the temperature modulation due to a collision centered on the north pole 
(^ = 0) has the form 

/(n) = (co + ci cos + 0{cos^ ^))6(^crit - 0) , (4) 

where the q are constants related to the properties of the collision, 6 is the angle measured from the center of 
the affected disc, and 0(^crit is a step function at the causal boundary 6>crit- Truncating the sum at (9(cos^), 
the constants cq and ci can be expressed in terms of a central amplitude zq and edge discontinuity 2;crit* 

^crit ~ ^0 COS^crit ^0 ~ ^crit /r\ 

1 - COS Ocrit 1 - COS l9crit 



We return to this point in Sec. VC when discussing the Bayesian framework for testing bubble colhsion models. 
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FIG. 1. The radial temperature modulation Eq. HI induced by a bubble collision centered on the the north pole = 0). 




FIG. 2. A Poincare-disc representation of the surface of last scattering inside our parent bubble, with one dimension suppressed. 
The future light cone of the collision at this time is denoted by the dark red line, with the shaded region representing the portions 
of the surface of last scattering that are to the future of the collision. Our past light cone at last scattering is represented by the 
dashed circle. From the present bounds on curvature, the size of our past light cone must be much smaller than one curvature 
radius. Zooming in on the portion of the surface of last scattering that we have causal access to (inset), the universe is very 
close to flat, and the region affected by the collision has approximate planar symmetry. The region affected by the collision 
appears as a disc of angular radius ^crit on the CMB sky. 



as shown in Fig.[T] Allowing the collision to be centered on an arbitrary location {^o, ^o} on the celestial sphere, 
the induced temperature modulation can be expressed as a function of five parameters: {zq, ^cnt, ^cnt, ^o, </>o}- A 
modulation of this form was first derived in Ref. [20 , where it was obtained from the observed modulation of a 
field representing the inflaton inside our bubble, numerically evolved in a background thin-wall bubble collision 
geometry. These authors did not predict the existence of a temperature discontinuity ^crit- While further work 
is needed to better predict the precise form of the template, in our analysis we allow bubble collisions to produce 
modulations with and without discontinuities. 

In Fig. [2j we show the Poincare-disc representation (see Ref. [14 for the details of this construction) of the surface 
of last scattering inside our parent bubble. The collision affects the shaded portion of this surface. The observed CMB 
is formed at the intersection of our past light cone (dashed circle) with the surface of last scattering, which in this case 
includes regions both affected and unaffected by the collision. From the underlying azimuthal symmetry, the collision 
appears as a disc on the observer's CMB sky. Zooming in on the neighborhood of our past light cone (inset), we can 
treat the universe as being flat. In addition, because we have causal access to much less than one curvature radius at 
last scattering (again from the observational bound on Clk), the collision has an approximate planar symmetry. 

The collision introduces pre-inflationary inhomogeneities into our bubble. The exact nature of these inhomogeneities 
depends on the specific model underlying the formation of our bubble and the subsequent epoch of slow-roll inflation, 
as well as the specifics of the collision. In dramatic cases, the collision ends slow-roll inflation everywhere within its 
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future light cone [18] , induces the transition to another vacuum state [23l [40l [4T] , or produces a post-cohision domain 
wah that eats into our bubble interior pT| [T9]. These scenarios are obviously in conflict with observation, and we do 
not consider them further. In mild cases, which will be our focus in the remainder of this paper, collisions satisfy 
the "compatibility" criterion defined above: the observable portion of the surface of last scattering is only minimally 
disturbed by the collision. Thin- wall analysis [17^ and numerical simulations [18, 20 indicate that it is indeed possible 
to find situations where the effects of a collision are compatible with our observed cosmology. 

The disturbance caused by a collision is a pre-infiationary relic and thus is stretched by the period of infiation 
inside the bubble. From the current bound on curvature [36 , we can infer that our past light cone encompasses less 
than one horizon volume at the onset of infiation. This implies that the initial disturbances caused by a collision, 
which is smeared out on the scale of the infiationary horizon after a few e-folds of infiation, has a wavelength today 
that is larger than the current horizon size. Together with the planar symmetry of the collision at last-scattering (by 
convention along the y-z plane), this implies that we can Taylor-expand the Newtonian potential (see Ref. [26 for 
a translation between the Newtonian potential and the originally postulated temperature modulation presented in 
Ref. [20]) about the causal boundary of the collision at x = Xcrit as 

*^coll = ^{o) (co + Ci(x - Xcrit) + 0{{X - Xcrit)^)) ©(^ " ^crit), (6) 

where ^{a) encodes the evolution of the potential with scale factor a and the q are model-dependent constants.]^ 

There are contributions to the observed temperature modulation from the Sachs- Wolfe effect, the integrated Sachs- 
Wolfe effect, and a Doppler effect (coming from the induced bulk peculiar velocity v of the fiuid in the region affected 
by the collision): 

f -5^+2£.a^^ + (v.fi + 0(.^)), (7) 
where is the scale factor at last scattering, a = 1 today, and 

V (X V<I>coll + Ci^^^coW- (8) 

da 

To leading order in v, the temperature induced by the collision is linear in <l>coii and its derivatives. Therefore, 
since x = x\^cos9 (where x\^ is the comoving distance out to which we can see on the surface of last scattering), the 
temperature fiuctuations induced by a collision are generally of the form Eq.[4] Further, even if the Newtonian potential 
is continuous across x = Xcrit , the resulting temperature fiuctuations need not be continuous across the causal boundary 
at ^crit- This discontinuity arises from the ISW and Doppler contributions to the observed temperature fiuctuation. 
Effects that we have neglected, including the finite thickness of the surface of last scattering and uncertainties about 
how the perturbations caused by a bubble collision propagate through our bubble interior, are encapsulated by the 
higher order terms in Eq. |4| These effects could smear out the causal boundary enclosing the collision on sub-degree 
scales. These corrections could be incorporated into our analysis as theoretical understanding improves. 

Given a specific model for the scalar fields making up the bubbles and driving eternal infiation, the kinematics of a 
particular collision, and our position inside our bubble, it is in principle possible to determine the free parameters in 
Eq. |4| Treating the colliding bubbles in the thin- wall approximation, some measure of the strength of a collision can be 
specified in terms of the vacuum energies in the bubbles, wall tensions, and kinematics as in Ref. [17 . The kinematics 
will induce a probability distribution for the free parameters in Eq. [4] However, an accurate treatment requires a 
calculation of the back-reaction of the collision on the behaviour of the infiaton inside our bubble. Preliminary work 
in this direction has been done [I8l |20l |23l |40] , providing a handful of examples. However, a systematic investigation 
has not yet been performed. This is distinct from the case where an ensemble of field theory models is considered, 
representing the string theory landscape. In this case, the fundamental parameters governing the structure of the 
colliding bubbles (wall tensions and vacuum energies) and the properties of the inner-bubble cosmology (including the 
number of infiationary e-folds etc) are drawn from some probability distribution. This again will induce a probability 
distribution for the free parameters in Eq. |4j whose nature is presently poorly understood. 

What would a bubble collision embedded in a CMB temperature map look like? In Fig. [3] we show a large-amplitude 
collision with and without background CMB fiuctuations. In the following sections, we apply the various stages of our 
analysis pipeline to this example to illustrate the algorithm. We make extensive use of such simulations in calibrating 



^ We are modeling the collision as a collection of super-modes truncated at the causal boundary, and our treatment is therefore very similar 
to the so-called "tilted universe" scenario [421 143| . The important distinction in the case of bubble collisions is that the perturbation 
vanishes at the causal boundary Xcrit- Because the collision entered our past light cone only relatively recently, we are still comoving 
with respect to the undisturbed FRW foliation, and the cancellation of the dipolar temperature modulation seen in Ref. [42ti44| does 
not occur. 
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FIG. 3. On the left, we show a bubble collision template with {zq = 5.0 x 10 ^,2;crit = —5.0 x 10 ^,^crit = 10.0°, = 
57.7°, 00 = 99.2°}. On the right we add simulated background fluctuations, smoothing, and instrumental noise. 



our analysis pipeline, and the details of their construction are presented in Sec. IV Although there could conceivably 
be many overlapping collisions, the predicted observational signatures of this scenario have yet to be explored, and 
we focus on simulations of distinct individual bubble collisions. Again, as theoretical understanding improves, our 
analysis could be extended to include the possibility of overlapping collisions. 

What would the detection, or absence, of a bubble collision tell us about the underlying theory of eternal inflation? 
To examine what the answer to this question might be, let us make some further assumptions about the temperature 
modulations caused by a bubble collision. First, assume that the potential induced by the collision (Eq.[6| is composed 
mostly of a single long- wavelength mode of physical wavenumber k. Second, assume that the Sachs- Wolfe effect is 
the dominant contribution to the observed temperature modulation. Under these assumptions, the amplitude of an 
observed temperature modulation is: 

2 k 

^0 - Q 7r^(^ls) (1 - COs6>crit) , (9) 
6 Ho 

where ais is the scale factor at last scattering. If the initial wavelength of the disturbance was of order one inflationary 
Hubble length k ~ Hj (since any fine-structure in the collision would be smeared within the first few e-folds of 
inflation), then <l>(ais) = ^(a = 0), and the physical size of such a mode at last scattering is given by 

k - nl^^Ho. (10) 

In this case, we have 

zo:^^(0)^^^/'(l-cos^crit). (11) 

If a bubble collision is detected, and a similar set of assumptions is proven correct in a specific model, the measured 
values of zq and ^crit allow one to infer the value of In the absence of a detected collision, Eq. Ill] can be turned 

1 /2 — 

into a bound on a combination of QjJ and <l>(0): 

^fc^'^(O) < [^O/ (1 - cos ^crit)]observational upper bound (12) 

This analysis can be recognized as an example of the Grishchuk-Zel'dovich effect in an open universe [42j[45]. 

Determining the detailed properties of the theory underlying eternal inflation through the observation of bubble 
collisions is likely to be a messy business. However, any model will predict an expectation value for the number of 
observable bubble collisions TV, making this a very useful phenomenological parameter. Any constraints on TV from 
data will also yield interesting information about our parent vacuum through Eq. [l] The most naive application of 
such a constraint, where we have evidence that TV > 1 or TV < 1, would yield the inequalities 

XH^^ < (^o detected collisions), XHp^ > ^-^ (collision detection), (13) 



^ The values of that one might be able to infer are conceivably below both the observational bound ^ .0084 and the theoretical 
observational bound fl^ < 10~^. For example, assuming zq ~ 10~^ and ^{0) ~ 1 (since the collision involves a relatively large release 
of energy), if a collision were observed at large angular scale (where cos^crit ~ 0), we can infer that ^ lO"-*^*-*. This implies that a 
collision is in principle observable even when curvature is not. We thank Lam Hui for elucidating this point. 
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These bounds would be most useful if we detect and/or B-mode polarization (the amplitude of which can be 
related to Hj) in future data. In the most optimistic scenario]^ if primordial B-mode polarization is detected by the 
Planck satellite, we can infer that Hj ~ 10^^ — 10^^ GeV. Further, if curvature is detected at the level Qk ^ 10~^, 
^ would be bounded from above or below by ~ 10^^ GeV^. The condition for eternal inflation 
must be consistent with this inequality. For example, assuming a Planckian 

could be bounded from above or below by 



then in Eq. 



13 



is XHp. < 1. Any application of Eq. 
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false vacuum energy {Hp ^ 10^^ GeV), the nucleation probability XHp^ 
~ 10~^, remaining consistent with the condition for eternal inflation. 



III. SUMMARY OF THE ANALYSIS PIPELINE 



Before providing a detailed description of our analysis pipeline, we motivate and summarize its various components. 
Eternal inflation can arise from a wide range of inflationary potentials, each producing a different expected number 
of detectable collisions on the CMB sky, TVg. We will therefore use TVg as a continuous parameter that characterizes 
particular models of eternal inflation. The standard cosmological model is given by the special case in which TVg = 0. 
Our primary goal is to determine, given the WMAP 7-year data, what constraints can be placed on TVg and whether 
models predicting TVg > should be be preferred over models predicting TVg = 0. 

The optimal approach to achieving this goal would be to construct the full posterior for from Bayes' theorem 
given full-resolution CMB data on the whole sky. Unfortunately, this would require inverting the full-sky full-resolution 
CMB covariance matrix as well as integrating the bubble-collision likelihood over a many- dimensional parameter space. 
These tasks are computationally intractable. However, taking advantage of the fact that bubble collisions produce 
discrete localized effects on the CMB sky, it is possible to approximate the full-sky Bayesian analysis by a patch-wise 
analysis if the most promising candidate signatures can be identified in advance. The implementation of such an 
approximation scheme requires two assumptions. First, we assume that the likelihood of models predicting TVg > 
is peaked in the regions of the sky containing the candidate collisions, and that the integral over the likelihood can 
therefore be estimated by concentrating on these regions, which make the largest contribution to the full integral. 
Second, we assume that these regions are separated widely enough to be uncorrelated with each other, so much 
smaller local covariance matrices can be used. These assumptions allow the results of a small number of localized 
(and therefore computationally-feasible) Bayesian model selection tests to be combined into estimates of the required 
full-sky statistics. Put simply, our algorithm implements a conservative approximation to the required numerical 
integral. A complete treatment of the full-sky analysis and the assumptions on which it is formed can be found in 
Appendix [Aj In addition, once a set of candidates have been identified, it is possible to apply further tests of the data 
in parallel. 

The full-sky approximation necessitates the development of an algorithm that identifies the most promising regions 
of the CMB sky and then processes them individually. Upon segmenting the full data set, it is important to avoid 
biasing oneself with a posteriori selection effects [34 , and it is therefore critical to minimize human intervention in 
choosing what portions of the sky to analyze. Thus our analysis pipeline is fully automated, tested and calibrated 
on realistic simulations of the data, and frozen before being applied to the real data. The final pipeline contains 
no algorithmic choices tunable via human intervention. As discussed in Appendix [Aj missing a bubble collision 
candidate which makes a significant contribution to the full integral leads to a conservative bias towards models 
predicting iVg = 0. This alleviates the worry that selection effects might lead to a spurious detection. 

Our analysis pipeline consists of a candidate identification step, followed by two parallel verification procedures: 

• Blob detection: To begin, we attempt to locate the most promising candidate signals using wavelets. Wavelet 
analysis is a compromise between working purely in position or harmonic space, and therefore yields information 
both about the location and angular scale of particular features in the temperature map. Specifically, we employ 
standard |46H5Q] and Mexican [5Tj spherical needlets, two classes of wavelets defined on the sphere. The statistics 
of the needlet representation of a purely Gaussian CMB temperature map (expected in the absence of a bubble 
collision at large scales where WMAP is cosmic- variance-dominated), combined with simulations of a bubble- free 
masked CMB sky, can be used to quantify the significance of various features. A set of significance thresholds 
are then defined to ensure a manageable number of "false detections" in the end-to-end simulation of the WMAP 



experiment (see Sec. IV A). Regions of the sky passing these thresholds are sewn into "blobs," whose size and 



location is determined by the needlet responses, and passed onto the next stages of the pipeline. 



In models where a collision is expected to be in our past, there might be good reason to expect a correlation between observed S-modes 
and the observation of a bubble collision |T8]. This is because large-field models of infiation, which generically predict a larger value for 
the tensor to scalar ratio, are much more robust in the presence of a bubble collision. In models of small-field infiation, a bubble collision 
can end infiation everywhere in its future light cone, implying that collisions in such models are not compatible with our observed 
cosmology. 
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• Edge detection: Once a set of candidate signatures is found, we look for circular edges across which the 
temperature is discontinuous. As discussed above, such causal edges are expected to be a generic signature of 
bubble collisions. We use the Canny algorithm [52 , adding an adaptation of the Circular Hough Transform 
(CHT) [53 , to focus our search on circular edges. The algorithm consists of identifying the most likely centre for 
a noisy circular edge. The significance of this response is calibrated from a detailed analysis of bubble collision 
simulations including cosmic variance, spatially- varying WMAP instrumental noise, and smoothing due to the 
instrumental beam. We verify that this step produces no false detections in the WMAP end-to-end simulation. 

• Bayesian parameter estimation and model selection: The regions highlighted by the blob detection step 
can be used to construct an approximation to the full-sky posterior probability distribution for TVg using the 
methods outlined in Appendix [Aj We first perform a pixel-based evaluation of the likelihood and Bayesian 
evidence in each blob for bubble collision templates of the form given in Eq. [4j sampling the parameter space 
using the nested sampler Multinest [54 . The likelihood analysis includes cosmic variance, spatially varying 
WMAP instrumental noise, and the smoothing due to the instrumental beam. Combining the evidences from 
each blob we obtain the posterior probability distribution for TVg, which is used to derive constraints on 
and perform model selection to determine if a theory with iVg 7^ is preferred over a theory with no predicted 
collisions. The significance of a detection is again calibrated from an analysis of simulated collisions and an 
end-to-end collision- free simulation of the experiment. 

The most important output of our pipeline is the approximation to the full-sky posterior probability distribution 
for iVg. This allows us to derive marginalized constraints on iVg, and perform model selection between theories with 
TVs = and TVs 7^ 0- Iii addition, for each blob identified by the first set of the pipeline, we obtain a set of marginalized 
posterior constraints on the model parameters {zq, ^crit, ^crit, 6>o, ^0}, a maximum needlet significance, CHT score, and 
a local Bayesian evidence ratio with respect to the no-bubble-collision model. 

IV. SIMULATIONS 

Our analysis pipeline is general, but each step must be calibrated using simulations of the particular data-set under 
consideration, in this case the WMAP 7- year data release [32]. WMAP has measured the intensity and polarization of 
the microwave sky in five frequency bands. The resolution of the instrument in each band is limited by the detectors' 
beams, and is highest at 0.22° in the 94 GHz W band. We perform our analysis on the foreground-subtracted W- 
band WMAP temperature map, as this combines the highest resolution full-sky data currently available with the least 
foreground contamination. To minimize the effects of the residual foregrounds we cut the sky with the conservative 
KQ75 mask, leaving 70.6% of the sky unmasked. 

We carry out extensive simulations to quantify the thresholds at which areas of the sky are passed from one step 
to another. To find the best approximation to the full-sky Bayesian analysis, we process as much of the sky as is 
computationally feasible. 

To determine the response of our pipeline to bubble collisions over the range of possible parameters, we generate 
simulations containing a variety of bubble collisions plus CMB, realistic noise and Gaussian beam smoothing. However, 
we also wish to ensure that we have a method to guard against systematic effects (e.g., foreground residuals and any 
map-making artifacts that may be present) that we do not have capability to simulate. These effects might lead to 
false detections in the "blob detection" stage, or critically, the edge detection and Bayesian analysis stages. It is 
impossible to claim a detection without first ensuring that there are no such false detections due to systematics. 

A. WMAP end-to-end simulation 

A realistic simulation of a WMAP-quality data-set that does not contain a bubble collision is an important tool for 
calibrating and quantifying the expected false detection rate of our analysis pipeline when applied to data which may 
include systematics (such as foreground residuals) that are not captured in our simulations or likelihood function. 
For this purpose we use a complete end-to-end simulation of the WMAP experiment provided by the WMAP Science 
Team[^ The temperature maps in this simulation are produced from a simulated time-ordered data stream, which 
is processed using the same algorithm as the actual data. The data for each frequency band is obtained separately 
from simulated sources including diffuse Galactic foregrounds, CMB fiuctuations, realistic noise, smearing from finite 
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FIG. 4. The two locations chosen for our simulated bubble collisions are over-plotted on the WMAP 7- year noise variances 
with the KQ75 7- year mask applied. The regions encompassed by the 5° and 10° simulated collisions are shaded black and grey 
respectively. Bubbles are centered in an unmasked high-noise {^o = 56.6°, 00 = 193.0°} (left) and low-noise {^o = 57.7°, 0o = 
99.2°} region (right). 

integration time, finite beam size, and other instrumental effects. In our analysis, we utilize the foreground-reduced 
W-band simulation. 



B. Simulated bubble collisions 



The temperature fluctuations observed in the CMB, including the effects of a bubble collision (originally found in 
Refs. [20l|39]), can be written as 

5T{n) = K(l + /(n))(l + 5{n)) - To],^„„,hed + ^I^noise(A) , (14) 

where n = {^, <p} is the position on the sk}|^ Tq is the average temperature of the map including the modulation, Tq is 
the average temperature without the modulation, STnoise is the contribution from instrumental noise, /(n) and S{h) 
are defined as in Eq. [s] The quantities in the brackets are smoothed with a Gaussian beam of 0.22° (approximating 
the beam size of the WMAP experiment in the W band). We use the WMAP best-fit 7- year power spectrum [55] in 
the multipole range 2 < i < 1024 to generate fluctuation maps ^Tsyn(n) = TQ(5(n) at the full WMAP resolution of 
^side = 512 (with 3,145,728 pixels). The noise term STnoise is generated from WMAP 7- year noise variances at the 
same resolution. Since the templates we consider add a relatively small temperature excess/deficit in one location, 
the features do not cause the power spectrum to deviate from that measured by WMAP [20^. Additionally, we can 
replace Tq ~ Tq, which gives 



ST{n) = [(1 + /(n))(To + (5T,y„(n)) - Tq] 



STnoiseii>)- 



(15) 



smoothed 

We consider collisions with ^crit = 5°, 10°, 25° and choose centers in a high-noise region (^o = 56.6°, = 193.0°) 
and a low-noise region (^o = 57.7°, (/)o = 99.2°) of the sky that remain significantly outside of the main body of the 
WMAP KQ75 7- year mask. The regions of the sky affected by 5° and 10° collisions are over-plotted in Fig. [4] on a 
masked map of the instrumental noise variance. For each 6>crit and location, we generate 35 simulated collisions with 
parameter values logarithmically spaced in the ranges 10~^ < zq < 10~^ and — 10~^ < Zcdt ^ — 10~^. The response 
of our pipeline depends only on the absolute value of zq and 2:crit7 so the choice of sign for zq and Zcrit is arbitrary. 
We repeat this for three realizations of the background CMB fluctuations, yielding a total of 210 simulated sky maps 
for each of the three collision sizes. 



V. ANALYSIS TOOLS 

We now describe in detail the analysis tools which make up our pipeline and how they are calibrated with simulations 



before being applied to the data. Readers wishing to skip these details may wish to study Figs. 11 and 17 and turn 



These angular positions can be expressed in terms of Galactic coordinates through longitude I = (p and latitude b = 90° 
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to Sec. |VD| for a summary of the outputs of the pipeUne at each stage, and the conditions under which a detection 
can be claimed. 



A. Needlets 



Wavelet analysis is a powerful tool for identifying features localized on the sky, the type of signal expected for 
bubble collisions. There exist families of wavelets that are defined on the sphere known as standard [46-50 and 
Mexican [51] spherical needlets. These functions form what is known as a "tight frame," allowing for a well-defined 
forward and reverse needlet transform. As in other forms of wavelet analysis, decomposing the temperature map into 
a sum over such functions yields information both about the location and angular scale of specific features. For a 
purely Gaussian temperature field, the statistical properties of the needlet transform can be straightforwardly related 
to the power spectrum, allowing a quantification of the significance of a possible detection. In addition, the spatial 
localization properties of the standard and Mexican needlets make it possible to avoid many of the problems associated 
with working on a cut sky. In this section we outline the properties of needlet transforms and analyze their utility in 
searching for bubble collisions. 



1. Definition of the spherical needlet transform 
The needlet transform is defined as: 

r(n) = ^/3,fc^A,fc(n), (16) 

where fi denotes a direction {^, 0} on the sky, j3jk are constant needlet coefficients, and ipjk{n) are the needlet 
functions. The members of this family of functions are labeled by the index k of the pixel at which they are centered, 
and their "frequency" j, which is related to the spatial extent of the needlet profile in real space. The sum in the 
needlet transform is over all pixels /c, and all frequencies j = 0,l,2,...,oo. For fixed j, there is one needlet coefficient 
Pjk for each pixel /c, allowing us to represent the needlet coefficients at fixed j as a map on the pixelated sky. The 
needlet functions are defined in terms of spherical harmonics Y^^(n) as 

i 

4>j,{h) = y^kJ2H(,B,j) Y;^{h)YUnk). (17) 

i m=-i 

Here, Xjk are the cubature weights, which are related to the area of each pixel. In the equal-area HEALPix pixeliza- 
tion [5^ we employ, all cubature weights are equal to Xjk = 47r/A/'pix, where N^^^ is the number of pixels, and we 
absorb this constant into the needlet coefficients Pjk- The function b{£,B,j) acts as a filter in harmonic space, where 
5 is a constant bandwidth parameter. It is chosen such that the family of functions form a tight frame (see 

e.g., Ref. [46,), which guarantees the existence of an inverse needlet transform given by 

= j T{h)^jk{n)dQ. (18) 

There are a number of possible choices for the function b{£,B,j), which distinguish the standard and Mexican 
needlets. A description of the explicit form of the function b{£,B,j) can be found for standard needlets in Ref. [46] 
and Mexican needlets in Ref. [51 . We plot 6 as a function of the multipole moments i in Fig.js] For standard needlets, 
b only has support for values of £ between B^~^ < £ < B^^^ . The bandwidth parameter B controls the width of each 
window function in harmonic space. Mexican needlets have support over all £ at each frequency j, and again have a 
bandwidth parameter B which controls the localization properties of the functions in harmonic space. 

In Fig. |6] we plot the wavelet functions in pixel space. As is to be expected, increasing the width of the function 
b{£^B^j) in harmonic space corresponds to improved localization in pixel space. In the limit of large j, there is an 
extremely small overlap of the needlet functions at nearby pixels. The compact support of b{£^ B^j) in harmonic space 
for standard needlets leads to slightly poorer localization in pixel space than is enjoyed by the Mexican needlets. 

If we decompose the temperature map into spherical harmonics: 



T(n) = ^a£^Y^^(n), 



(19) 
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FIG. 5. The filter function b{i,B,j) for standard (left) and Mexican (right) needlets with B = 2.5 for j = 0,1,2,3 (solid, 
dashed, dot-dashed, dotted). 




FIG. 6. Standard needlets in pixel space. On the left, we show standard needlets ipjk with B = 2.5 for j = 1, 2, 3 (dot-dashed, 
dashed, solid) at fixed /c as a function of the polar angle 0. On the right, we show standard needlets ipjk for fixed j = 3 at 
three pixels k (dashed, solid, dot-dashed) as a function of the polar angle (note: since we are projecting onto 0, the needlets 
appear asymmetric). 



then Eq. [Tsj together with the inverse transform 

aem = jT{n)Y;^dn (20) 

leads to: 

Pjk = \/Xjk^b{£,BJ) aimyimi'^k), (21) 

where are the spherical harmonic coefficients. This formula allows us to transform directly from the a^^s to the 
spherical needlet coefficients Pjk- In our analysis, the needlet transforms are accelerated by generating the a^^s at 
full WMAP resolution (A^side = 512) but limiting the reconstruction multipoles to 2 < < 256, and needlet positions 
k to the pixels at A^side = 128. This retains the resolution required to reconstruct features from half-sky to half-degree 
scales, encompassing the range of all detectable collisions. 



2. Needlet response to the bubble collision templates 

We now quantify the sensitivity of the needlet transform to the presence of a collision. In the absence of the 
background Gaussian fluctuations, we perform the needlet transform on a set of bare collision templates. As an 
illustration, in Fig. [t] we plot the spherical harmonic coefficients as a function of i (all coefficients for m ^ vanish 
by symmetry if we center the template on the north pole) for 25° and 5° collision templates, overlaid on top of the 
rescaled filter function b{i^B^j) for spherical needlets. The spherical harmonic coefficients for the collision templates 
peak at a value of i related to the angular scale of the causal boundary. Therefore, the needlet coefficients are largest 
at a frequency j that is directly related to the angular scale of the collision. This can be seen in Fig. [7| where the 5° 
collision has a maximum response at j = 3 and the 25° collision has a maximum response at j = 2. 

In Fig. [8| we plot the needlet coefficients for the 25° template at a variety of polar angles, for < j < 3. The 
needlet coefficients are largest in the center of the region affected by the collision (here, the template is centered on 
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FIG. 7. The spherical harmonic transform (connected dots) of a ^crit = 5° (left) and ^crit = 25° (right) bare collision template 
centered on the north pole on top of the filter function b{i,B,j) for standard needlets with B — 2.5 for j — 0, 1,2,3 (solid, 
dashed, dot-dashed, dotted). The overlap of the filter function b{i,B,j) with the spherical harmonic transform of the bubble 
template (see Eq. 21 ) determines for which value of j the needlet transform yields the largest signal. In these examples, the 5° 

3 and the 25° collision at j 



collision has the largest needlet response at j 
for the 25° collision is plotted in Fig. [s] 



2. The needlet response as a function of angle 




FIG. 8. 5 = 2.5 for standard needlets and a 25° collision, j = (circles), j = 1 (squares), j = 2 (diamonds), and j = 3 
(triangles) . 



^ = 0), and at a frequency j correlated with the angular scale of the collision {j = 2). As expected, the needlet 
response is sensitive to both the location and angular scale of the collision. 

By studying the needlet response to a variety of bare collision templates given by Eq. [4j we can find optimal values 
of the bandwidth parameter B for each needlet type. Larger-bandwidth needlets produce stronger signals, but also 
respond to a greater range of bubble sizes. We have found that the values B = 1.8 and B = 2.5 for the standard 
and B = lA for the Mexican needlets are a good compromise between signal strength and angular localization of 
response. In our analysis, we use this suite of three needlet transforms, ensuring that we are sensitive to temperature 
modulations with a variety of profiles, and allowing us to cross-check any candidate signals. 

As an important step in our analysis pipeline, we build lookup tables containing the possible range of bubble collision 
scales, ^crit,min ^ ^crit ^ ^crit,max5 to which cach uecdlct type and frequency is sensitive. We first generate a set of 100 
templates at each integer 6>crit between 1° and 89° by randomly sampling Zcrit between —5 x 10~^ < Zcrit < 5 x 10~^ 
with zq = 5 X 10~^ for Zcnt > 0, and zq = Zcrit + 5 x 10~^ for 2:crit < 0- This creates a set of templates with uniform 
total amplitude (i.e., constant fiO^cj)) — fiOcTit^cj))) but a variety of profiles at each angular scale. Next, we calculate 
the needlet coefficients for each of the three members of our needlet suite, and record the frequency generating the 
maximum central needlet response for each template. The range in ^crit recorded at each frequency is used to generate 
the lookup tables in Table [Tj 



3. Needlet coefficients on a cut sky 



The CMB is completely dominated by foreground emission in the region of the Galactic plane, and is also affected 
by bright point-sources. These issues are typically handled by applying a mask which covers the Galactic plane and 
known point-sources. The needlet transform can be applied directly on the masked temperature maps, and because 
the needlet functions are localized in pixel space, needlet coefficients far from the mask for sufficiently high frequency 
j are not significantly affected. These high-frequency needlets are mainly composed of high-^ spherical harmonics, 
and so cut-sky a^^nS can safely be used to calculate the needlet coefficients through Eq. [2l] Unfortunately, the low- 
frequency needlets are quite sensitive to the presence of the mask. To partially mitigate this sensitivity, we calculate 
the optimal unbiased maximum-likelihood estimators of the a^rnS [57| at low I. 
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TABLE I. Angular scale lookup tables for standard needlets with B = 2.5 (left), standard needlets with B = l.S (center), and 
Mexican needlets with B — lA (right). For a needlet frequency j, the needlet transform is sensitive to bubble collisions on 
scales ^crit,min < ^crit < ^crit,max. No rcsults are shown for the standard needlets with B = 1.8, j = as they have no support 
over the range of angular scales considered. 



Such maximum-likelihood estimators are computationally very expensive, and we must balance accuracy against 
limited computational resources. Another minor complication arises from the smoothing that is necessary to band- 
limit the data when performing the maximum-likelihood reconstruction algorithm of Ref. [57] : information leaks from 
inside the mask. Comparing the reconstruction on masked and unmasked simulated temperature maps using 10°- 
FWHM Gaussian smoothing, we have determined that a reasonably small bias is obtained when maximum-likelihood 
a^rn^ are used for < 10. 

This set of hybrid a^^nS - maximum-likelihood reconstructed a^^s for £ < 10 and cut-sky a^^nS for £ > 10 - is used 
in Eq. |2l]to calculate the needlet coefficients in the analysis that follows. 



4. Statistical properties of needlet coefficients 

For a Gaussian CMB without sky cuts, the statistical properties of the spherical harmonic coefficients [46^ are 

(a^^) =0, (la^mP) = Q. (22) 
These are related to the statistical properties of the j3jk in a straightforward way by 

(/3ifc> = 0, {/3|fc)= ^6(^,5, (23) 

which are identical at each pixel k. Thus, in a full-sky analysis, comparison with the Gaussian variance yields a 
measure of how likely it is to find a particular needlet coefficient in a purely Gaussian realization of the CMB sky. 

In the presence of foregrounds, however, it is necessary to work on a cut sky, which introduces a i- and /c-dependent 
bias. Following Ref. [48 , we determine the significance of a needlet coefficient on the cut sky byH 

^ ^ _ l/^jfe ~ (/^j/c) gauss, cut I ^25j 
\J (/^j/c) gauss, cut 



^ One can also define composite significances, involving needlet coefficients at multiple frequencies. An example is 

We have evaluated this statistic on a variety of collision templates modulating Gaussian realizations of the background CMB fluctuations, 
and found that it returns about half the significance given by Eq. l25| 
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FIG. 9. Needlet-coefficient variance maps for standard needlets with B — 2.5, j — 2 (top row) and Mexican needlets with 
B — lA, j — 11, generated using an ensemble of 3000 Gaussian CMB reahzations. The map-averaged variances of the full-sky 
maps (left column) agree well with expectations from theory (Eq. 23 predicts 774 /xK^ and 150 /iK^ respectively), as do regions 
of sky sufficiently distant from the 7- year KQ75 mask (right column). The low-frequency needlets are affected predominantly 
by the Galactic cut, whereas the high-frequency needlets are affected by point-source masking. 



where the average, (/^j/c) gauss, cut? s^nd variance, gauss, cut? 

are calculated at each pixel from the needlet coefficients 
of 3000 collision-free Gaussian CMB realisations with the WMAP 7-year KQ75 sky cut applied. Simulating only 
cosmic variance is sufficient here because the measurements made by WMAP are cosmic-variance-limited on the 
scales of interest, ^cnt ^ 5°. 

Maps of the needlet variances obtained from simulations are shown in Fig. [9] for an example with low (top) and 
high needlet (bottom) frequency. On the left are the needlet variances calculated without a mask, which agree at the 



5% level with the expected variances from Eq. 23 On the right are the masked needlet variances, which are clearly 
biased within a certain distance from the mask in both cases. Variances in the low-frequency example are affected 
predominantly by the Galactic cut. Variances in the high-frequency example are affected in a much smaller region 
of the Galactic cut (reflecting the increased spatial localization of needlets at high frequencies), but are much more 
significantly affected by the point source masks. 



To illustrate the expected response to a bubble collision, in Fig. 10 we show the temperature map of our illustrative 



example of a simulated bubble collision on the cut sky, and the significances of its needlet coefficients calculated from 



Eq. 25 at j = 3 using standard needlets with B = 2.5. The location of the collision is clearly highlighted in the map 
of needlet coefficients. The significance of the needlet coefficients in pixels in the center of the collision form a global 
maximum on the entire map. 

In order to identify a set of needlet coefficients with a particular feature, we sew regions with 5 or more pixels whose 
needlet coefficients exceed a frequency-dependent threshold into "blobs" (we discuss in more detail below how these 
thresholds are set). The core of each blob contains all adjacent pixels that pass the significance threshold. This core is 
then extended by first finding the average position no of the pixels in a blob and modeling it as a disc of radius A^/2 
(where is the maximum separation between any two pixels in the blob) centred on no. The blob is then extended 
to a radius of 1.1 x (6>crit,max + ^^/2), where ^crit,max is found from Table [l| (which is dependent upon the needlet 
type and frequency at which a feature is found) to ensure we include all related pixels. All pixels not contained in a 
blob are masked, and this new temperature map is passed to subsequent steps in the analysis pipeline. Eliminating 
irrelevant pixels allows us to drastically reduce the computational effort needed in the subsequent analysis. 
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FIG. 10. The temperature (top) and needlet coefficient signfficance (bottom) maps for a simulated bubble collision with 
zo = 5 X 10"^, Zcvit = -5 X 10"^, 6'crit = 10° on the CMB sky with the WMAP 7-year KQ75 mask applied. We show the 
map of needlet coefficients which gives the largest needlet response, in this case j = 3 for standard needlets with B = 2.5. The 
right-hand panels provide close-ups of the collision region. 
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TABLE II. The number of blobs found in the masked WMAP end-to-end simulation above significances ranging from 3 < 5* < 4 
for standard needlets with B — 2.5. 



5. Analysis of the WMAP end-to-end simulation 



As there are many independent needlet coefficients over the sky, it is inevitable that highly significant features will 
be detected in even a purely Gaussian CMB temperature map. In addition, residual foregrounds and instrumental 
artifacts may give rise to features which are mis-classified as candidate bubble collisions by the pipeline. To understand 
how these effects might contribute to the needlet significances, we ran the suite of needlet transforms on the end- 
to-end simulation of the WMAP experiment (described in Section IV) with the 7- year KQ75 mask applied. As an 
illustration of our results, in Table [ll] we give the number of blobs of varying significance found in the masked end- 
to-end simulation using standard needlets with B = 2.5. At increasingly high frequency, for which there are more 
independent needlet coefficients, more and more blobs are found that pass relatively large significance thresholds. 

We use the results of Table |ll] (and similar tables for other members of the needlet suite) to define a set of 
needlet frequency-dependent significance thresholds that allow a manageable number of false-positives, while retaining 
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TABLE III. Sensitivity thresholds Smin and the number of significant blobs detected in the end-to-end simulations A^biobs for 
standard needlets with B = 2.5 (left), standard needlets with B = 1.8 (center), and Mexican needlets with B = 1.4 (right). 
Only blobs that do not intersect the galactic cut are reported. No results are shown for the standard needlets with B = 1.8, 
j = as they have no support over the range of angular scales considered. 



feature blob B j S 

1 1 2.5 3 3.83 

1 2 1.8 5 3.55 

2 1 1.8 4 3.99 

3 1 1.8 5 3.28 

4 1 1.8 5 3.33 

4 2 1.4 10 3.77 

5 1 1.8 6 3.96 

6 1 1.8 7 4.13 

7 1 1.8 7 3.97 

8 1 1.8 7 4.34 

9 1 1.4 11 3.71 

10 1 1.4 12 4.14 



TABLE IV. Blobs found by the needlet transform in the WMAP end-to-end simulations with the 7-year KQ75 mask. 



sensitivity to a fairly large range of collision template parameters. The significance thresholds we use in our analysis 
are shown in Table [nij There are a total of 17 blobs in the masked end-to-end simulation that pass these thresholds. 
Comparing their locations on the sky, we can associate these blobs with 13 features (if a feature is picked up by 
multiple needlet types or frequencies, it can have multiple blobs associated with it). For three of these features, the 
set of pixels that pass the needlet threshold intersect the Galactic cut. We associate these with a response to the 
mask, and do not consider them further. For the other ten features, the needlet type and frequency which yielded the 



maximum significance is recorded in Table IV 



6. Analysis of bubble collision simulations 



To assess how robustly the needlet transform can pick out a collision in the temperature map, we have performed 
an analysis of the simulated collisions described in Section [IV| If the later steps of the analysis are to have a chance at 
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FIG. 11. Exclusion (black) and sensitivity (grey) regions for the needlet step of the analysis pipeline applied to a set of 5° (left) 
and 10° (right) simulated bubble collisions. If all realizations at the high and low noise locations yield a detection, we include 
the collision in the exclusion region; such collisions would certainly be detected as long as they were not significantly masked. 
If a detection is made in any realization/location, we include the collision in the sensitivity region; such collisions could be 
found if they were in a favorable location of the sky (i.e., low noise, or a region with a specific realization of background CMB 
fluctuations) . 



detecting a simulated collision, it must be contained within the set of pixels defined by the blob. To determine if the 
needlet analysis has detected a bubble collision, we therefore require that a blob exists which fully contains the region 
affected by the collision, and that the true center of the collision lies within the set of pixels passing the significance 
threshold. 

The results of this analysis for the 5° and 10° collisions are shown in Fig. [TT] We define the "exclusion region" of 
these plots as the part of parameter space for which all six realizations/locations yield a detection. If there were a 
bubble collision in the WMAP 7- year data with these parameters, it would be detected with high significance regardless 
of its location on the sky (as long as the collision was not significantly masked). The "sensitivity region" is defined 
as the part of parameter space for which any of the six realizations/locations yields a detection. A bubble collision 
in this range of parameter space would be detected only for a favorable location or realization of the background 
fluctuations. The exclusion and sensitivity regions for the 25° collisions are identical to those for the 10° collisions. 

Looking at the simulations in detail, there are a few general trends. First, for the needlets to pick out a collision, it 
is sufficient to have either a relatively large central amplitude zq or a relatively large temperature discontinuity Zcrit at 
the causal edge. This is clear from the shape of the exclusion region in Fig. [TT] From the size of the sensitivity region 
in this plot, one can also see that the amount of instrumental noise and particular realization of the background 
fluctuations can greatly affect the significance of the needlet coefficients in the vicinity of a collision. Many more 
collisions were detected in the low-noise region than the high noise region of the sky. For collisions in the exclusion 
region, there is a significant needlet response for all three needlet types over a range of frequencies, with an average 
significance exceeding 5 > 5. Collisions in the sensitivity region are typically detected by only one needlet type and 
frequency, with an average significance near 5* :^ 4. As our ability to detect 5°, 10°, and 25° collisions is nearly the 
same, we conclude that these results are fairly representative of our detection limits over all angular scales > 5°. 

These general trends can be contrasted with the response to features found in the end-to-end simulations. Here, 
blobs are typically detected with a single needlet type and are near the significance threshold (not surprisingly, as 
the threshold was chosen to have this property). A feature detected in the data by multiple needlet types and/or 
frequencies at a significance of 5* > 4 would be a good bubble collision candidate. However, we stress that many 
different underlying features could give rise to such a signal. The following steps in the analysis pipeline, which we 
now describe, are designed to verify if these candidates are in fact bubble collisions. 



B. Edge detection 



The first of the two parallel verification steps of the pipeline tests whether features highlighted by the blob detection 
stage have circular temperature discontinuities. The unambiguous detection of a circular temperature discontinuity 
would strongly suggest that a particular feature highlighted by the needlets is in fact a bubble collision. We employ 
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the Canny edge detection algorithm [52], whereby the gradient of an image is generated and thinned into single-pixel 
proto-edges, the best of which are stitched together into "true" edges. We also use an adaptation of the Circular 
Hough Transform (CHT) algorithm [53], which assigns a "score" dependent on how many edge pixels lie on circles of 
varying centers and radii. In this section, we describe our edge detection algorithm, and study its performance on an 
end-to-end simulation of the experiment, as well as on simulated bubble collisions. 



1. The Canny edge detection algorithm 

The Canny edge detector is the standard edge detection algorithm in image-processing software, and has recently 
been used to search for cosmic strings [30l |^. Designed as the optimal algorithm for localized, duplicate- free 
detection of edges within a noisy image, it uses three steps - smoothed gradient generation, non- maximal suppression 
and hysteresis thresholding - to extract contiguous edge sections. In Fig. [l2j we depict each of these three steps as 
applied to a temperature map containing a circular discontinuity; each of these steps are in turn described below. 

1. Smoothed gradient generation: The gradient of a Cartesian image is traditionally generated by convolving 
the image with two small symmetric filters, each determining one orthogonal component of the gradient. A 
number of simple filters - typically 3x3 pixels - perform the job adequately, but the optimally adaptable filters 
are the first partial derivatives of the two-dimensional Gaussian [52 . Using these filters is equivalent to first 
convolving the image with a small Gaussian filter (and thus smoothing out the effects of small-scale noise on 
the gradient calculation, an important step given the small number of pixels involved in the calculation) and 
then finding its gradient components. 

Unfortunately it is impossible to construct symmetric pixel-based gradient filters that cover the whole sphere. 
We therefore carry out both the Gaussian smoothing and gradient generation steps in harmonic space, making 
use of HEALPix's in-built alin2inap_der subroutine to calculate the magnitude and direction of the gradient 
at each pixel. We smooth with a Gaussian filter of FWHM 0.22° - approximately two pixels' width at the 
resolution of our input maps - to minimize features due to pixel noise while retaining longer edges. 

The gradient maps are generated before masking to reduce "ringing" from the sharp sky cut back into the map. 
The smoothing step ensures that any leakage from masked features is restricted to areas a few pixels within the 
sky cut, and affects only areas a few pixels outside of the cut. Nevertheless, any features found close to the 
mask should be carefully examined to check if they are generated from within the mask. 

2. Non-maximal suppression: The second step of the Canny algorithm reduces the smoothed gradients produced 
by the first step into local maxima. At this stage, all pixels are assumed to belong to a local edge, whose direction 
is defined to be perpendicular to the local gradient direction. Taking each pixel at a time, the two direct neighbors 
which lie closest to the local edge are found. The gradient magnitudes of the three pixels are compared, and 
the central pixel's gradient magnitude is set to zero unless it is the largest of the three. Processing each pixel 



in turn reduces the gradient magnitude map to only the local maxima (see the central panels of Fig. 12). 

As an example, consider the simplest case of crossing a sharp discontinuity along a perpendicular path. The 
gradient direction is constant at each pixel, whereas the smoothed gradient magnitudes increase until the edge 
is crossed, when they start to decrease. A non-maximal suppression algorithm tracks along the path setting all 
of the gradient magnitudes to zero apart from the pixel on (or closest to) the edge. 

3. Hysteresis thresholding: At this stage of the Canny algorithm, we have gradient magnitudes and directions 
for a set of local maxima of varying amplitude, some corresponding to true edges (which may have been disrupted 
by noise) and others to runs of noisy pixels or to more slowly- varying boundaries of CMB patches. To filter out 
true edges from the noise, and stitch together any edges that have been split, the final step of the algorithm 
takes advantage of the fact that, unlike randomly-oriented noise, true edges conserve their gradient magnitude 
and direction (to an extent affected by the shape of the edge, the pixelization scheme, and the noise level) over 
their path. 

Hysteresis thresholding involves first setting an upper threshold for the gradient magnitude: any pixels surpassing 
this threshold are considered to be part of true edges, and a new "true edge" map is created with these pixels' 
positions marked. A second, lower threshold is then set. Hysteresis thresholding then proceeds as follows: 

(a) The map is scanned until a true edge pixel is found. 

(b) The next potential edge pixel is defined to be the direct neighbor closest to 90° clockwise from the local 
gradient direction. The gradient direction of this pixel is compared to the current pixel's. To do so, the 
local phase angles to the current pixel's nearest neighbors are found, and used to define the neighbor closest 
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FIG. 12. An illustration of the Canny algorithm for edge detection. The input temperature map contains a circular discontinuity, 
which can be seen in a map of the gradient magnitude as a local maximum. Non-maximal suppression selects for local maxima 
in the gradient map. The hysteresis thresholding step finds stitched edges by comparing the local direction of the gradient at 
adjacent pixels. 



to the current gradient direction. The neighbors adjacent to this pixel are then determined. The gradient 
direction at the next potential edge pixel is required to lie between the phase angles of these neighbors. 
This rather loose requirement allows the algorithm to step along pixelated curved edges. 

i. If the two pixels' gradient directions match within the tolerance, 

A. If the neighbor's gradient magnitude passes the low threshold but not the high, it is considered 
to be part of a potential true edge. Its position is marked in a history array; then the algorithm 
"steps onto" this pixel and the process is repeated from step (b) until one of the other conditions 
is met. 

B. If the neighbor's gradient magnitude passes the high threshold, all pixels found on the way from 
the source pixel are confirmed as a true edge. Their positions are marked on the true edge map, 
and the algorithm returns to step (a). 

C. If the neighbor's gradient magnitude fails both thresholds, the edge is considered to be false: the 
history of potential edge pixels found on the way from the source pixel is erased, and the algorithm 
returns to step (a). 

ii. If the two pixels' gradient directions do not match within the required tolerance, 

A. If the neighbor's gradient magnitude passes the high threshold or the neighbor is already marked 
in the history array, all pixels found on the way from the source pixel are confirmed as a true edge. 
Their positions are marked on the true edge map, and the algorithm returns to step (a). This 
ensures simple branched and looped edges can be reconstructed. 

B. If the neighbor's magnitude does not pass the high threshold and the pixel is not already marked 
in the history array, the edge is considered to be false, the history of potential edge pixels found 
on the way from the source pixel is erased, and the algorithm returns to step (a). 

The entire process is then repeated, choosing the neighbor closest to 90° counter-clockwise to the local gradient 
direction in step (b). 



The end product of the Canny algorithm is a Boolean map of stitched true edges (see right-hand panel of Fig. 12). 
To reduce computation time, the pipeline from hysteresis thresholding onwards is restricted to the blobs produced by 
extending the regions passing the needlet significance test to ensure any discontinuity is fully contained. 

Care must be taken when setting the thresholds used in the hysteresis thresholding step. If either threshold is set 
too high, very few edges are confirmed. If the low threshold is set too low, a huge amount of potential edges are 
considered, and the algorithm runs extremely slowly. As the edges we could potentially find must be comparable in 
amplitude to the CMB signal and detector noise (as they have not yet been discovered by eye) and have been smoothed 
by the WMAP beam, we set low thresholds to err on the side of caution. Low and high thresholds of 30% and 40% 
of the maximum gradient magnitude found in each search region are found empirically to confirm edges generated 
in simulations in feasible computation timescales. This means that the gradients associated with the strongest CMB 



features - typically ~ 1° in scale - are classified as edges, as is shown in Fig. 12 
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FIG. 13. A depiction of the Circular Hough Transform (CHT). On the left is a Boolean map of edge pixels, as output by the 
Canny algorithm. Centering a pair of arcs oriented in the direction of the local gradient about each edge pixel, the CHT counts 
the number of times each pixel is intersected. The presence of a circular edge is indicated by a maximum in the CHT score - 
the hit count divided by the arc radius - as the arc radii are varied. On the left, we show a set of arcs, centered on four pixels 
on the circular edge we wish to detect; there will be no clear peak in the CHT score for this radius. Increasing the arc radius 
to match that of the circular edge (center), there will be a large number of hits at the true center of the edge. On the right, we 
show the actual map of the CHT score at each pixel for this example. As this has been scanned at the correct angular scale, 
there is a large peak at the center of the circular edge. 



2. Circular Hough Transform 

The maps of stitched candidate edges found using the Canny algorithm are processed using the Circular Hough 
Transform to search for the presence of circular edges. The basic idea, as shown in Fig. [Tsj is to count the number of 
intersections between circular arcs of varying radii centered on each of the candidate edge pixels and oriented along 
the local gradient direction. If there is a circular edge in the map, the number of intersections will be maximized at 
the center of the circular edge when the radius of the circular arc matches that of the edge. 

Assuming an edge pixel forms part of a circular edge of angular radius ^crit, one can define a prescription for the 
set of pixels that are the potential centers of the edge. The two most likely candidates in this set are the pixel ^crit 
away in the direction of the local gradient, and the pixel the same distance in the opposite direction; the edge could 
be a step up or a step down. Building in flexibility to cope with pixelation and noise effects, this set is expanded to 
two annular arcs of radius 6>crit7 oriented in the direction of the local gradient and centred on the edge pixel. 

The CHT works by assuming that all edge pixels are part of circular edges. The most likely circle center at a given 
radius is defined to be the pixel that is included in the greatest number of these arcs, counted using an accumulator 
array. If the search radius matches the radius of a circular edge within the map, we expect all of the circular edge 
pixels' arcs to include the true center, and the CHT accumulator will show a single clear peak. If the search and 
true radii are discrepant, fewer of the circular edge pixels' arcs will intersect, and this peak will appear as a ring 



with decreased amplitude (see Fig. 13). When the search and true radii are very discrepant, any rings will disappear 
beneath the background due to randomly-oriented noise and other non-circular edges. Note that "non-circular edges" 
will include the ~ 1° CMB fluctuations that are qualified as edges by the hysteresis thresholding step. 

To compare the CHT results at different search radii, one must divide out the approximately linear growth with 
angular radius of the number of pixels in each annular arc. We call this normalized quantity the "CHT score". The 
most likely center and radius of a circular edge within a map can therefore be found by scanning the map with the 
CHT at a range of radii and determining the maximum CHT score. 

The blob detection step provides the range of scales 6>crit,min < ^crit,i < ^crit,max of potential circular edges present 
in each blob. To determine whether a blob contains a circular edge, we compare the CHT scores obtained by scanning 
at every 0.25° increment within this range, using annular arcs that are 0.25° thick and which cover 45° of phase about 
each edge pixel. The annular arcs are therefore approximately two pixels thick, and are fairly wide to account for the 
effects of pixelation on the gradient direction. The thickness of the CHT annular arcs leads to an uncertainty in the 
CHT radius of 0.25° and position of 0.50°. If a circular edge is detected, we expect a clear peak in the CHT results 
for a particular blob. 

In Fig.jMjwe show the output of the edge detection algorithm on our illustrative example bubble collision simulation 



(see Fig. 10). On the left is the portion of the temperature map containing the collision. On the right we plot the CHT 
score in the pixels that passed the needlet significance threshold for ^crit,i = 10° (the input value). There is a clear 
peak at the location of the true center of the simulated bubble collision, which is 3 times the average response at 
other pixels. Since this feature was flagged in the blob detection step for standard needlets with 5 = 2.5 at j = 3, the 
range of radii scanned during the CHT step is determined from Table [l| to be 5° < ^crit ^ 14°. This range contains the 
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FIG. 14. The temperature map (left) and CHT score (right) for our illustrative example of a 10° bubble collision simulation. 
The CHT score is recorded at each pixel passing the needlet significance threshold. For a search radius of 10° there is a clear 
peak in the CHT score at the center of the simulated collision. 
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FIG. 15. The global maximum of the CHT score at each circle radius for the collision simulation shown in Fig. |14| The 
collision has a maximum response for standard needlets with 5 = 2.5 at j = 3, which from Table ^ sets the search range to be 
5° < ^crit < 14°. The peak of the CHT score at 10° clearly identifies the correct angular scale of the simulated collision. 



true radius ^crit = 10°. In Fig. 15 we plot the maximum CHT score found in the map for each circular radius, which 
contains a clear peak at the true radius of the causal boundary. This signal is a clear and unambiguous signature of 
a bubble collision. From visual inspection of the temperature map, it can be seen that we are able to clearly detect 
the edge even though the background fluctuations, noise and beam drastically reduce the sharpness of the observed 
temperature discontinuity. 



3. Analysis of the WMAP end-to-end simulation 



We expect strong circular edges to be extremely rare in a purely Gaussian CMB temperature map. However, it 
is possible that foregrounds, instrumental noise, the mask, and other experimental artifacts could lead to a spurious 
detection of a circular edge. To evaluate this, we have performed the edge detection step of our analysis pipeline on 
the features that passed the significance threshold in the WMAP end-to-end simulation (see Table IV) with the KQ75 
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mask applied. 

Comparing each feature in the end-to-end simulation with the bubble collision example studied above, the peak 
structure of the CHT score as a function of angle and morphology in pixel space are both drastically different. 
Examining the maximum CHT score as a function of circular radius, although there are several peaks, the clearest of 
which is shown in Fig. 16, their amplitude relative to the average score is nowhere near that of the collision example 
shown in Fig. 15 In addition, from the plots of the CHT score at each pixel, there are typically a number of fairly 
broad local maxima at different locations with approximately the same score. This is in contrast to the collision 
example of Fig. [Ml which yields a highly peaked score around a small number of pixels. 



4- Analysis of bubble collision simulations 



To better understand the response of our edge detection algorithm to the signal from a bubble collision in WMAP- 



quality data, we have analyzed the simulations described in Section IV We use as inputs the blobs found using the 
first step of the pipeline, and search for circular edges over the range of angular scales appropriate to the needlet type 
and frequency for each blob (see Table [l|. We conclude that a true causal edge has been detected if there is a global 
maximum for the CHT score at the radius of the true edge and the pixel with the highest score is within a typical 
CHT error (0.5°) of the actual center. We again present our results in the form of a contour plot denoting exclusion 



and sensitivity regions in the parameter space of zq and 2;crit- This is shown in Fig. 17 for simulated bubbles with 



^crit = 5° and 10°. The plot for the 25° collisions is identical to the plot for 10° collisions. 

Again, from the size of the sensitivity region, we conclude that our ability to make a detection is dependent on the 
location of the collision and the particular realization of the background CMB fluctuations. The exclusion region for 
the 10° (and 25°) collisions is far larger than for the 5° collisions. We attribute this to the proliferation of ~ l°-sized 
features in the background fluctuations, which can both disrupt a significant fraction of the edge pixels in a small 
collision and swamp the collision signal with their own strong gradients. We therefore expect our sensitivity to edges 
at small angular scales ^crit ^ 5° to be quite poor at WMAP resolution. As the performance of the edge detection 
algorithm for 10° and 25° collisions is identical, we conclude that the 10° results are fairly representative of our 
sensitivities over a wide range of angular scales 6>crit ^ 10°. Most of the collisions we mark as a detection have a clear 



peak in the CHT score of the type seen in Fig. 15 If a collision has parameters in the exclusion region, it would be 
reliably detected. Based on these results, the first two steps of our pipeline can detect bubble collisions with central 
modulations \zo\ > 3 x 10~^ and causal edges |^crit| ^ 3 x 10~^ at ^crit ^ 5° in WMAP-quality data. 
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C. Parameter estimation and model selection 



In many CMB anomaly analyses (but not all - see, e.g., Refs. [58H6Q ). the significance of a signal is quantified by 
calculating the frequentist P-value of some relevant statistic. This typically involves doing a large number of Monte 
Carlo realizations of the standard cosmological model (i.e., the "null hypothesis"), calculating the above statistic for 
each, and finding the fraction for which the statistic has a "more extreme" value than was actually observed. There are 
several problems with this approach. First, the calculated P-value is only a measure of how (un) likely the measured 
data were given the null hypothesis of the standard model; no comparison is made to an alternative model (which is 
what we are primarily interested in here). Second, the notion of "more extreme" is fundamentally ambiguous - both 
"more discrepant" (i.e., values of the statistic which are further from some fiducial expected value than that which was 
measured) or "less likely" are common choice^ The heart of the problem is that all such P- values are integrals over 
the likelihood, whereas it is only the likelihood of the actual data that is relevant. The fact that the likelihood and 
its integral generally have a similar qualitative dependence in the tail(s) of the distribution (i.e., both tend to zero for 
extreme values of the statistic) can mask this problem. In particular, if the tails of the likelihood are Gaussian then 
the integral that gives the P-value falls off more rapidly than the likelihood itself, and so the resultant P-values are 
unreasonably harsh on the null hypothesis. A related problem is that that many attempts to identify CMB anomalies 
using frequentist P-values are overly sensitive to a posteriori selection effects (see, e.g., Refs. [34j[61] for a discussion 
of this effect). Here the issue - that the statistics being applied to the data are often chosen on the basis of interesting 
features initially identified in the same data - is not intrinsic to frequentist methods (which, correctly, do not permit 
any data to be used more than once); but the need to invent a statistic from which to calculate a P-value can make 
it hard to avoid this trap. For these reasons we do not use P-values in our analysis. 

Instead, we adopt a Bayesian approach. Bayes' theorem provides a prescription for parameter estimation. In 
addition, given that we have two well-defined hypotheses, we can utilize Bayesian model comparison to make proba- 
bilistic statements about the degree to which the available data (and theoretical prior information) imply that bubble 
collisions have been observed. As shown by Ref. [62 , Bayesian methods are the only self-consistent framework for 
such calculations. The optimal Bayesian calculation would be to evaluate the likelihood of the entire WMAP data- 
set under the two models; however this not computationally feasible at present. In Appendix [Aj w e outline a set 
of simplifications that allow us to approximate the optimal Bayesian result. As outlined in Sec. |III[ we utilize the 
information on the location and scales of the most probable bubble collision sites obtained in the blob detection step 
of the pipeline to implement this procedure. Even this reduced problem is computationally demanding: analysis of 
the blobs detected in the WMAP 7-year data during the first steps of the pipeline requires three days' processing on 
28 cores. Working at full resolution is necessary to ensure that any possible circular temperature discontinuities are 
examined. 

These computational limitations also mean we are only able to process a limited number of simulated temperature 



For simple, single-peaked, likelihoods these two definitions are at least equivalent, but in some cases (e.g., a likelihood that is constant 
over some finite range) neither definition is satisfactory. 
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maps with and without bubble cohisions. The WMAP end-to-end simulation provides a great asset at this stage, giving 
the best possible measure of what false detections are to be expected from experimental effects and any systematic 
errors that are not included in our likelihood. We also analyze a small number of representative bubble collision 
simulations to obtain an estimate of the strength of signal we are looking for. 
We now describe our methods and the results from simulations in greater detail. 

1. B ay esian formalism 

A model of eternal inflation predicts the average number of collisions TVg that are, in principle, detectable by our 
pipeline on the full sky The ultimate goal of our Bayesian analysis is to evaluate the full posterior probability 
distribution for TVg, given a CMB data set d covering a sky fraction /sky. Using Bayes' theorem, this can be written 



Pr(iV,|d,/,ky) 



Pr(7V,)Pr(d|iVs,/sky) 
Pr(d|/sky) 



(26) 



The form of the posterior depends on the model prior Pr(7Vs) and the evidence (also known as the model likelihood) 
Pr(d|7Vs, /sky)- The evidence is defined by marginalizing the likelihood, Pr(d|m, TVg, /sky), over the n parameters 
describing a collision, as specified by the model m. Once the shape of the posterior has been determined, it is 
normalized using Pr(d|/sky). The posterior leads directly to constraints on the values of TYg consistent with a CMB 
data set. 

In a landscape scenario, TVg can be considered as a continuous parameter and the prior Pr(7Vs) would be determined 
from a measure over the possible values of TVg. We can also view TVg as a proxy for different models of eternal inflation 
(i.e., selecting a single value of TVg), as described further in Sec.[llj The standard cosmological model without bubble 
collisions is specified by the case TVs = 0. Using Eq. [26j the probability of a model which predicts TVg collisions (on 
average) relative to that of the standard cosmological model is 



Pr(7Vs|d,/sky) _ Pr(7V,)Pr(d|7V„/,ky) 



(27) 



Pr(0|d,/,ky) Pr(0)Pr(d|0,/sky) * 

The model priors and the evidence values play an equal role in this relationship, but in the absence of a detailed 
understanding of the former, it is often useful to proceed under the assumption that the two models are equally 
probable a priori. A theory predicting an expected collisions is favoured over the standard model when the 
relative probability on the LHS of Eq. |27| is greater than unity. 

It is also useful to provide heuristic conversions between the Bayesian evidence ratio and other commonly used model 
comparison quantities. The number of "sigma" of an anomaly statistic, Na, is often used to characterize the deviation 
from a null model, but it is unambiguously defined only in the case that the null distribution of the chosen statistic is 
Gaussian with zero mean. In such a case the probability of measuring an Na deviation is P{N) oc exp(— A/'^/2), which 
can be identified approximately with the inverse of the ratio in Eq. |27| so that, e.g., a 3a detection is comparable to 
a ratio of approximately one hundred. However we emphasize that both the number of sigma and related statistics 
such as Ax^ are of limited utility in the context of all but the most trivial model comparison problems. 

Computing Pr(d|iVs, /sky) by marginalizing over the likelihood for the full prior volume is an immense computational 
task, requiring the inversion of the full sky WMAP covariance matrix at full resolution and marginalizing over all 
possible numbers, locations, and sizes of collisions. However, taking advantage of the fact that bubble collisions 
produce discrete localized effects on the CMB sky, it is possible to approximate the full-sky Bayesian analysis by a 
patch- wise analysis if the most promising candidate signatures can be identified in advance. We describe in detail in 
Appendix [a] an algorithm to perform such a patch- wise approximation to this full multidimensional integral. 

The key ingredient is determining the regions of parameter space where the likelihood is significantly peaked, and 
hence gives the most significant contributions to the evidence. If these regions can be identified, the integral need only 
be performed over the restricted ranges to obtain an estimate of the evidence at greatly-reduced computational cost. 
We use the results of the blob detection step of the analysis pipeline to identify the patches which are likely make the 
most significant contributions to the integral. Assuming that the bubble collision model likelihood is peaked in the 
A/b detected blobs, we show in Appendix [A] that the unnormalized posterior can be approximated as 



Pr(iV,|d,/,ky)aPr(iV,)e- 



E 

N,=0 



(/skyiVs) 



E 

bi,b2,---,b]\ 



(28) 



The number of detectable sources Ns is a subset the total number of sources on the sky N (Eq.[l}. 
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FIG. 18. The normalized posterior Pr(iVs|A^b, /sky) (see Eq. A16) assuming /sky = 0.7. In the left panel, we show the posterior 
obtained for one blob Nb = 1 for a local evidence ratio pb = 4. Comparing with the posterior at iVs = (dashed line), we see 
that any theory predicting iVg < 4 will be preferred over the theory without bubble collisions. In the right panel, we show the 
posterior obtained for four blobs with identical local evidence ratios pb — 1/2. Again, comparing with the posterior at Ns = 0, 
any theory with iVs ^ 7 will be preferred over the theory without bubble collisions. When there are multiple blobs, the bubble 
collision hypothesis can be supported even when the evidence ratio for each blob is less than one. 



where the pre-factors reflect the fact that the number of collisions present on the observable sky, TV's, is the realization 
of a Poisson-like process (of mean /sky^s), and is the evidence ratio evaluated within a candidate collision region 
(with data sub- set d^) using a single bubble collision template 

_ Pr(d,|l) 

The posterior can therefore be built from local measures of how well the data are described by the standard model 



with and without a collision template. Once Eq. 28 is obtained in a particular case, it can be normalized, although 



this is not strictly necessary to perform the parameter estimation and model selection analyses. 



To illustrate some possibilities, in Fig. 18 we plot the normalized posterior assuming /sky = 0.7 (from the KQ75 
mask) and a uniform prior on TVg, for the case where there is a single detected blob (left panel), and four detected 
blobs (right panel). A theory predicting a particular value of will be preferred to a theory without bubble collisions 



if the ratio in Eq. 27 is larger than one. This amounts to comparing the posterior at some value of to the posterior 
at TVs = (dashed line). To prefer any theory with bubble collisions, in the one-blob case it is necessary for the blob 
to yield a local evidence ratio larger than one (here, we plot the posterior assuming p^ = 4). This is not true when 
there are multiple blobs, as can be seen in the right panel of Fig. [Tsj where we plot the posterior assuming each blob 
has a local evidence ratio pb = 0.5. The bubble collision hypothesis (for some values of TVs) is preferred even when the 
local evidence ratios are less than one: a number of marginal detections can be significant when considered together. 
We can also obtain any desired confidence intervals on by examining the shape of the posterior (although it is 
always the whole distribution that is the full answer to any parameter estimation problem). 

When the local evidence ratios are large, the posterior can be approximated by Eq. |A16| appropriately normalized. 
In Fig. fTol we plot the posterior in the limit of large evidence ratios (again assuming /sky = 0.7) for no blobs, two 
blobs, and four blobs. Even in the presence of large local evidence ratios, it can be seen that the posterior has a 
significant spread due to cosmic variance: we only have access to one realization of bubble collisions on the CMB sky. 
Note that this is true even when there are no detected blobs. When there are multiple decisively detected blobs, the 
posterior correctly assigns a very small probability to iVs = 0. 

Our analysis also provides constraints on the parameter values of each candidate collision. The constraints on the 
n template parameters m are encoded in their joint posterior distribution 

The marginal distribution of any subset of the parameters is given by integrating Pr(m|d5, 1) over the remaining 
parameters which are not of interest. For the bubble collision model the parameters should include both those 
describing the collision and the global cosmological parameters; marginalizing over the latter would give constraints 
on the properties of a (putative) detected collision. We now discuss the analysis of the likelihood and evidence ratios 
for a patch in greater detail. 
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FIG. 19. The full posterior Pr(iVs|A^b, /sky) (Eq. A16) that would be obtained from a conclusive detection (i.e., ph^ ^ 1) of 
Nh = 0,2,4 (solid, dashed, and dot-dashed curves) blobs containing bubble collisions assuming /sky = 0.7. The presence of a 
sky cut skews the distribution towards Ns > Nb. Note that even when features are conclusively detected, there is an intrinsic 
uncertainty in Ns; this is a form of cosmic variance. 



2. Analysis of candidate bubble collision patches 

At the heart of the above formalism for assessing the full posterior for TVg is the evaluation of the patch likelihood 
for a single collision, Pr(d5|m, 1). Here the data, d5, are the measured temperature values of the pixels in the vicinity 
of the detected blob that are not in the sky cut. The bubble collision model parameters, m, should include both 
those that describe the collision, {^^o, ^crit, ^crit, ^o, ^o}? as well as the cosmological parameters which determine the 
CMB power spectrum. However any plausible bubble collision would be sufficiently localized that the cosmological 
parameters are essentially uncorrelated with them; moreover they are sufficiently tightly constrained from CMB 
measurements that their uncertainties are minimal in the context of a template-matching problem like this. Hence 
we fix the cosmological parameters to their best-fit WMAP values [55 and only the bubble collision parameters are 
varied. Hence m = {^^o, ^crit, ^crit, ^o, ^o} for the bubble collision model, and there are no free parameters in the null 
model. Indeed, the no-collision model can be treated as a special case of the collision model in which the collision has 
zero amplitude. 

As both the CMB signal and the WMAP noise are Gaussian, the likelihood has the form 

Pr(d6|m,l)(xexp(^-ix') = exp |-i[db - t(m)]Tc-i[dfc - t(m)]| , (31) 

where t(m) is the temperature modulation caused by the collision and C5 is the pixel-pixel covariance matrix. The 
temperature modulation of the p^^ pixel is given from Eq. [sjas tp = 1 + /(np), where n is the position on the sky. The 
covariance matrix includes CMB cosmic variance, Gaussian smoothing approximating the WMAP W-band beam, and 
the pixel-dependent WMAP noise. The covariance between two pixels p and q with angular positions and is 
hence given by 

Cp,g = Np,g + V ^^CePeinp ■ n,), (32) 

where Ci is the best-fit WMAP CMB power spectrum convolved with a Gaussian beam of FWHM 0.22°, Pi{x) is the 
Legendre polynomial of degree and Np^q is the noise covariance between pixels. This is taken to be 

Np,, = ^,,1^, (33) 

where Sp^q is the Kronecker delta function, aw = 6.549 mK is the RMS noise of the W-band detectors, and A^obs,p is 
the number of times WMAP has observed the p^^ pixel. To preserve any edges, we must invert C5 at full resolution. 
Given available computational resources, the maximum area of the sky we can study at any one time is limited to 
patches of radius ~ 11° surrounding the center of each detected blob. 

The evaluation of the evidence integral Eq. |A12| and the full characterization of the posterior distribution of the 
parameters are both computationally challenging - even when restricted to small patches - as they require a large 
number of likelihood evaluations. In all but the simplest of cases it is fatally inefficient to evaluate the likelihood 
over a multi-dimensional grid and so a variety of sampling algorithms have been developed in which the likelihood is 
only evaluated in the high posterior regions that are of most interest. For both parameter estimation and evidence 
calculations we use the nested sampling algorithm [63^ as implemented in the publicly available MultiNest package 
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[54] . MultiNest performs numerical integration in order to estimate the evidence values; the required convergence of 
the integration can be adjusted to balance computation speed with accuracy. At the settings we use, the evidence 
values returned by MultiNest are accurate to ~ 10%. We use the getdist routine included in CosmoMC [64] to 
extract parameter estimates and uncertainties. 

The parameter prior Pr(m) in Eq. 30 is derived from theory, previous experimental results, and the limitations of 
the data-set and pipeline: it encompasses the full prior understanding of what defines a detectable collision. Because 
we lack a detailed theoretical prediction for the amplitude parameters in each template (as discussed in Sec. [llj ), 
we assume a uniform prior on zq and Zcdt over the ranges — 10~^ ^ zq < 10~^ and — 10~^ < 2;crit ^ 10~^, set by 
the observed temperature fluctuations in the CMB. Bubbles with larger values of these parameters would have been 
visible to the naked eye in any existing CMB data- set. Bubble collisions are expected to be distributed isotropically 
on the CMB sky, and so we assume uniform priors on the full ranges of {cos 7^0} to ensure that the probability 
of finding a bubble per unit area is constant across the sphere. Theory predicts that bubble collision radii should 
range from 0° to half-sky, but our pipeline's sensitivity is restricted by CMB power at small scales and computational 
requirements at large scales. The non-zero prior range for detectable bubble collisions is accordingly restricted, and 
we assume uniform priors on ^cnt values in the range 2° < 6>crit ^ 11°. 

To minimize computation time, the evidence integrals are only calculated over the parameter ranges within which 
the priors are non-zero and the likelihood is peaked. For each feature, the angular scale lookup tables (Table [l|) indicate 
the range of interest for ^crit- Merging all of the sets of significant pixels found for each feature yields the ranges for 
{Oo^(j)o}. As the needlets are equally sensitive to cold and hot features with varying profiles, little information about 
the ranges of interest for {2:o,^crit} can be extracted from the blob detection results, so the full prior volume must 
be considered. The accuracy of this procedure has been verified by performing the integral on the same patch of sky 
using different parameter ranges for {^cnt^ ^o^ ^o}- As long as the likelihood peak is encompassed by the parameter 
ranges given to Multinest, the returned evidence values agree to within numerical accuracy. 

For the fiducial collision example shown in Fig. [lO] our analysis yields an evidence ratio of Inp^ = 119.8 ± 0.3: 
the collision model is a very good fit to the data. The full-sky posterior would favour any theory predicting bubble 
collisions over a large range of TYg. The marginalized bounds on the parameters are compared to the input parameters 



in the first row of Table VI the agreement is excellent. However, to make a final judgment about a detection, we 



must ask what types of evidence ratios we get for false detections in the WMAP end-to-end simulation. 



3. Analysis of the WMAP end-to-end simulation 



We have performed the full Bayesian para meter estimation and model selection analysis on the blobs found in the 
WMAP end-to-end simulation (see Table IV). The total processing time for the full pipeline to run on this single map 
is on the order of 12 hours on 28 cores. Our results for the evidence ratios and marginalized parameter constraints 
for {zq, Zcrit^Ocrit^Oo^ (/)()} foT cach feature are recorded in Table [V| 

The evidence ratios for the features identified in the blob-detecti on ste p of the pipeline are all significantly less than 
one. We can therefore approximate the full posterior for TVg by Eq. |A20 , and rule out TVg > 1.6 at the 68% confidence 
level. The posterior is maximized at TVg = 0, and we therefore correctly conclude that the data from the end-to-end 
simulation does not warrant augmenting ACDM with bubble collisions. 

These results from the end-to-end simulation yield quantitative information on the degree to which systematics and 
foregrounds could mimic the signal from a bubble collision. Reassuringly, no features yield evidence ratios greater 
than one. To be distinguishable from systematics and foregrounds, we require the evidence ratios that we find for any 
feature to at least exceed the evidence ratios found in the end-to-end simulation at similar needlet frequencies. 



4. Analysis of bubble collision simulations 



The long processing time, even for a single map, prohibits us from running the Bayesian parameter estimation 
and model selection analysis on the full set of bubble collision simulations. We therefore choose a small number of 
representative examples from the set of simulated collisions passing the needlet significance threshold (drawn from 
the exclusion and sensitivity regions of Fig. [IT]). Six 10° collision simulations were chosen to sample distinct areas of 
our parameter space, specifically collisions with: 



Eq. [2] predicts that the angular scale distribution for all bubbles falling within our past light cone varies with sin^crit- However, this is 
derived under the assumption that collisions do not affect our bubble interior, and a more careful treatment might lead to a correlation 
between the values of zq, ^^crit? and ^crit- To retain consistency with our uniform priors on zq and ^^crit? we assume a uniform prior on 
^crit- Regardless, both choices for the prior lead to identical conclusions for the WMAP 7- year data. 
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TABLE V. Results of the Bayesian parameter estimation and model selection analysis for the WMAP end-to-end simulation. 
The ranges of ^crit are determined from the needlet response (see Table [l| . By computational necessity, the evidence integral is 
truncated at 11°. Reported error bars are at 68% CL. Angular quantities are quoted in degrees. 



1. a large central amplitude and edge; 

2. a small central amplitude but large edge; 

3. a large central amplitude but small edge; 

4. a medium central amplitude and medium edge; 

5. a small central amplitude and medium edge, and; 

6. a small central amplitude and small edge. 

The first two collisions lie in the CHT exclusion zone, the third in the needlets exclusion zone, and the others in the 
sensitivity region. All collisions were placed at the low- noise location to maximize the chance of a detection. 



The results of the Bayesian analysis of the collision simulations are displayed in Table VT The first example 
corresponds to the collision in Fig.[lO| and is clearly a highly significant detection with an evidence ratio of In 120. 
The second example is, again, an extremely clear detection, with In p5 136. While the evidence for the third example 
is numerically lower than for the strongly discontinuous cases, at In 29, it is again a conclusive detection. In 



each of these examples, the full-sky posterior assuming TV^ = 1 (which is well approximated by Eq. A18), would prefer 
models with bubble collisions over a wide range of N^. 

For the collisions sampled in the sensitivity region, the maximum needlet significance recorded in each case was 
around 5 4, which is on the upper end of the significances found in the end-to-end simulation: similar features 
in the data would be passed to the Bayesian analysis section of the pipeline. The evidence ratios were found to be 
ln/)5 :^ 9 for the collision with a medium central amplitude and a medium edge, Inp^ —1 for the collision with the 
medium edge but a smaller central amplitude, and lnp5 ~ —7 for the collision with a small edge. Since the latter 
two templates differ only by the value of ^crit^ this is further proof that the presence of a detectable causal boundary 
increases our ability to distinguish a collision. In addition, comparing examples 3 and 4, it can be seen that changing 
the central amplitude by a bit more than a factor of two yields an evidence ratio that is orders of magnitude larger. 
Apparently, there are rather sharply defined limits of detection. For these marginal cases, the parameter uncertainties 
are significantly underestimated due to the relative strength of the CMB and noise. Only in the case of the collision 
with a medium amplitude and medium edge could we conclude that models with bubble collisions are preferred over 
those without over a modest range in TVg. 

In conclusion, for the simulated collisions in the needlet and CHT exclusion regions of parameter space, our pipeline 
can clearly determine that the bubble collision hypothesis is favoured for a variety of N^. In the other cases we have 
studied, where the collision lies in the needlet sensitivity region, the conclusion is less clear. The evidence ratios are 
higher than most of those in the end-to-end simulation, but not much greater. They are also small in magnitude, and 
therefore do not yield full- sky posteriors that favor the bubble collision hypothesis. Thus, while we might rule these 
features out as being systematics or foregrounds, better data would be needed to definitively establish the bubble 
collision hypothesis. Furthermore, the bounds on parameter values in detections associated with the sensitivity regions 
of parameter space should be regarded as rough estimates only. Note also that since the data sets we consider for 
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TABLE VI. The input and marginalized 68% CL parameter bounds for the representative sample of simulated 10° collisions. 
All simulated collisions are located at Oq = 57.7°, 00 = 99.2°. Hatted quantities are estimates, and un-hatted quantities are 
inputs. No errors are quoted for the estimated central positions and radii for the cases where there was an extremely strong 
detection. This is due to the pixelization of the map: variations in collision- centre coordinates or radius of much less than a 
pixel's width will not affect the pixelated template, and hence will not affect the likelihood. Angular quantities are quoted in 
degrees. 



each blob are restricted to patches of the sky smaller than 11°, the gain in sensitivity that arises from the existence 
of a circular temperature discontinuity will not be present for modulations with 6>crit ^11°- For large features with 
an edge, the evidence ratios we obtain would therefore be an underestimate. 



D. Summary of the analysis pipeline and conditions for claiming a detection 



We now summarize the analysis pipeline and the interpretation of its outputs. First, the analysis pipeline segments 
the sky into "blobs," each of which corresponds to a region which, for some needlet type and frequency, passes our 
needlet significance threshold. A specific region of the temperature map can be covered by multiple blobs if there is 
a response for multiple needlet types/frequencies at the same location. The output of this first step in our pipeline 
is the location, size, and maximum significance associated with each blob. The edge detection step of our pipeline 
finds the CHT score as a function of assumed circle size and pixel. If there is a clearly peaked global maximum for 
the CHT score, this can be processed into the most likely circle center and angular scale. In parallel, we calculate 
the marginalized constraints on the parameters {2:0, ^crit, ^crit, ^o? 0o} and Bayesian evidence ratio for each feature. 
These evidence ratios are then used to construct the full-sky posterior Pr(7Vs|^b7 /sky) (Eq. 28) which is a function of 

The posterior allows us to put constraints on the possible values of that are consistent with the data. Comparing 
the value of the posterior at TVg = and some particular value of TVg specifies whether or not the ACDM model should 
be superseded by a model that also predicts on average TVg bubble collisions. If a large ratio of the posteriors is 
obtained, a conclusive detection of the bubble collision hypothesis can be claimed (provided a model that predicts an 
appropriate value of exists). A clear peak in the CHT score would indicate the presence of a circular temperature 
discontinuity in the CMB. This is a clear signature of bubble collisions, and would be nearly conclusive evidence for 
the eternal inflation scenario. We have found using simulations that a clear edge also yields large evidence ratios, 
indicating that these two tests are complementary. However, an edge is not necessary to verify the bubble collision 
hypothesis. There is a clear expectation obtained from the end-to-end simulation for the contribution from false 
detections due to systematics and foregrounds: the absence of a clear peak in the CHT score, and evidence ratios for 
each blob not exceeding In ~ —6.6 at detectable scales. 



VI. ANALYSIS OF THE WMAP 7- YEAR DATA 



Our analysis of the W-band WMAP 7- year foreground-reduced temperature map with the KQ75 mask produces a 
total of 38 blobs passing our needlet sensitivity thresholds. These blobs can be grouped into 15 distinct features, four 
of which either intersect or are within a few pixels of the main Galactic cut; these features are assumed to be responses 
to the mask, and we do not consider them further. The properties of the blobs belonging to the 11 remaining features 



are given in Table VII 



A number of these features have been noted previously. Feature 2 is at the same position as the famous Cold 
Spot ^65^. In addition, features 1 and 3 are coincident with the most significant hot spots identified in the needlet 
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TABLE VII. Features found by the needlet transform in the WMAP 7- year data. Features 1 and 3 correspond to the hot spots 
found in Ref. 48 ; feature 2 is the Cold Spot 65 . Angular quantities are reported in degrees. 



analysis of Ref. [48 . The number of features we have found is consistent with the results from the WMAP end-to-end 
simulation, although the simulation does not contain as many high-significance features at low j. In addition, the most 
significant features in the WMAP 7-year data generate responses from multiple needlet types at multiple frequencies 
(e.g., the Cold Spot is picked out by seven needlet frequencies), whereas features in the end-to-end simulation tend to 
be highlighted only by a single needlet. Interestingly, 9 of the 11 features identified as significant are in the Southern 
Galactic hemisphere. 

The CHT scores do not have a clear peak at any angular scale or location for any of the detected features. Indeed, 
the detailed outputs for the data are completely consistent with those obtained for the end-to-end simulation. The 
largest CHT peak found in the data is shown in Fig. 20 (which should be compared to the most peak-like feature found 
in the end-to-end simulation, shown in Fig. 16). We therefore find no evidence for circular temperature discontinuities 
in the WMAP 7-year data, and can rule out bubble collisions in the CHT exclusion region defined by simulated 



collisions shown in Fig. 17 



The marginalized parameter constraints and local evidence ratio for each of the features is recorded in Table VIII 
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Since we are limited to patches of the sky 11° in radius, the evidence ratios we have obtained for features whose ^crit priors extend 
beyond ~ 11° will be underestimated if a weak edge exists outside the patch of sky considered. 
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FIG. 20. The clearest peak found during the edge-detection analysis of the WMAP data. The contrasts in scores as a function 
of position (left) and radius (right) are comparable to those obt ained in the analysis of the end-to-end simulation (Fig. 16), 
and greatly reduced compared to the collision example (Figs. 14 and 15). 



Features 2, 3, 7, and 10 have evidence ratios significantly larger than those found in the collision-free end-to-end 
simulation (Inp^ ~ —6.6), specifically — 4.6. —4.1, —5.4 and —3.8 respectively. Assuming A^b = 4, and using these 



values for the local evidence ratios in Eq. 28, we find that the posterior is maximized at A^s = 0, and we can constrain 
iVg < 1.6 at 68% CL. One would need roughly lnp5 ~ —1 for each of the four features to prefer the bubble collision 
hypothesis for any value of N^. Therefore, the WMAP 7- year data does not warrant adding bubble collisions to 
ACDM. 

Although the local evidence ratios found for the WMAP 7-year data were not large enough to yield support for the 
bubble collision hypothesis, they are about an order of magnitude larger than what was expected from systematics 
based on the end-to-end simulation. The analysis of future data sets may increase the significance of these blobs if 
they are indications of bubble collisions, or else they will decrease in significance if they are not; in any case they are 
the most significant features on our sky, and thus take priority in being further investigated with better data. Thus, 
we now examine these four most significant features in more detail. The location of each of the four features on the 
sky is shown in Fig. |2l] A closer view of each feature is shown in Fig. [22j along with plots of the needlet significances 
S triggering the Bayesian analysis step, collision templates for the marginalized parameter constraints found in each 
case, and the CMB sky as it would appear with these template contributions removed. 

To confirm that these features are not due to residual foregrounds, we have also applied our suite of needlet 
transforms to the WMAP 7-year Q (41 GHz) and V (61 GHz) band foreground-reduced maps. Taking all of the 
needlets which generate a significant response for the four most significant features, we calculate the average of the 
needlet coefficients within the regions described by the estimated bubble templates. The results are plotted in Fig. [23] 
We show, for each blob forming part of a feature, the W-band-normalized needlet coefficient averages given by 

A/o _ ^jfe,Q/V - Pjk,W . . 

^Pj/c,Q/V - 5 , ['^V 

Pjk,W 

where ^jk,Q/v/w is the pixel- averaged needlet coefficient value in a given WMAP frequency band. The plots are 
consistent with no change in the strength of the signal with frequency, suggesting that the features are not due to 
foreground contamination. 



VII. CONCLUSIONS AND OUTLOOK 



An exciting opportunity to confront the eternal inflation scenario with experiment lies in the observation of collisions 
between other bubble universes and our own. In this paper, we have described an algorithm to search for the imprint of 
bubble collisions on the cosmic microwave background, and applied it to the WMAP 7- year data. Our search algorithm 
targets the generic signatures expected from bubble collisions: azimuthal symmetry, long-wavelength modulation of 
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TABLE VIII. Results of the Bayesian parameter estimation and model selection analysis for the WMAP 7-year data. Reported 
error bars are at 68% CL. Angular quantities are reported in degrees. The ranges of ^crit are determined from the needlet 
response (see Table |l]). By computational necessity, the evidence integral is truncated at 11°. Hence, an evidence ratio for 
feature 1 could not be calculated as its ^crit range lies entirely beyond this upper bound. The angular positions {^o,0o} can 
be related to Galactic coordinates through longitude lo = 00 and latitude bo = 90° — Oq. 




FIG. 21. Full-sky map showing the positions and sizes of the four features with largest evidence ratios, alongside the 7- year 
KQ75 sky cut. Feature 2 is plotted in orange, feature 3 in red, feature 7 in light blue and feature 10 in light green. 



the temperature confined to discs on the sky, and circular temperature discontinuities. For this reason, we expect our 
analysis to be fairly robust under changing assumptions about the underlying theory, which is presently rather poorly 
understood. 

The analysis pipeline we have developed takes a two-pronged approach, applied in parallel. The first uses heuristic 
techniques to test for the presence of features specific to bubble collisions. The second is a fully Bayesian algorithm 
for the general problem of non-Gaussian source detection, implemented as a patch-wise approximation to the full-sky 
model selection and parameter estimation problem. The data set is segmented in a completely automated way, allowing 
us to avoid a posteriori selection effects associated with choosing the most "interesting" features on the CMB sky by 
hand. The algorithm is tested and thresholds at each step are calibrated using extensive simulations, and then frozen 
before ever looking at the data, to follow as much as possible the philosophy of a blind analysis. Candidate collisions 
are identified from an input temperature map based on the response to a suite of needlet transforms (calibrated 
using simulations with and without bubble collisions), and grouped into "blobs." These blobs are scrutinized for 
circular temperature discontinuities using an edge detection algorithm. The quantitative significance of an edge is 
characterized using the Circular Hough Transform (CHT). The blobs are also used to construct an approximation 
to the full-sky Bayesian parameter estimation and model selection problem for bubble collisions. The posterior 
probability distribution over the expectation value for the number of detectable collisions, TVg, is then obtained. This 
allows us to quantify which of the two models - a theory whichpredicts on average TVg bubble collision signatures 
described by temperature modulations of the form given in Eq. pj or else the standard model (specified by TVg = 0) 
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FIG. 22. Maps of the four features with largest evidence ratios. The top row shows the W-band temperature map in the locahty 
of the four features, masked with the KQ75 mask. Overlaid are circles indicating the estimated position and angular scale found 
in each case. The second row contains plots of the masked needlet significances for the needlets whose ^crit priors produced the 
largest evidence ratios. These plots appear pixelated as the blob detection step is carried out at reduced resolution. The third 
row shows the bubble collision templates corresponding to the estimated model parameters; these templates are subtracted 
from the W-band data in the fourth row. The width of each plot is ~ 16.7°. 



with CMB plus realistic noise and beam effects - better explains the data. 

Applying our analysis pipeline to simulations, we have found that a circular temperature discontinuity at the causal 
boundary is a clear signature of bubble collisions Although our analysis can identify collisions without temperature 
discontinuities, their presence greatly increases our ability to make a conclusive detection. Both the edge-detection 
and Bayesian model selection steps have the ability to identify a causal boundary in the patches of the sky that are 
highlighted as candidate collisions by the blob detection step of our analysis pipeline. We have found no evidence 
for circular temperature discontinuities in the WMAP 7-year data using either method. Based on our analysis of 
simulations, this allows us to rule out the presence of collisions in the exclusion region of Fig. 17 For collisions larger 



^'^ The observational detection of a circular temperature discontinuity is so unlikely to arise spuriously that it provides conclusive evidence 
of a detection. 
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FIG. 23. WMAP channel frequency dependence of the features highUghted by our pipeUne. The W-band-normaUzed difference 
in pixel- averaged needlet coefficients between the Q and W bands (triangles) and V and W bands (diamonds) are plotted for 
each blob making up a feature, as highlighted by the full suite of needlet transforms. 



than 6>crit ^ 10°, this corresponds to 10^|zcrit| ^ 3-6 for the amplitude of the circular temperature discontinuity 
defined in Eq.[4j For collisions on smaller scales, the CHT step loses sensitivity due to the proliferation of degree-scale 
blobs in the background CMB fluctuations. 

The posterior evaluated using the WMAP 7- year data is maximized at TVg = 0, and constrains TVg < 1.6 at 68% 
confidence. We therefore conclude that this data set does not favor the bubble collision hypothesis for any value of N^. 
In light of this null detection, comparing with the simulated bubble collisions, we can constrain the central amplitude 
of the temperature modulation caused by the collision (defined in Eq. [4| to be ^ 1 x 10~^ over the range of scales 
^crit ^ 5° we have simulated. If the collision is described by a single super-Hubble wavelength mode confined to a 



disc on the sky, from Eq. 12 we can use these bounds (with the largest collision size we have simulated: ^crit = 25°) 



to constrain 11^^^ ^(0) < 7 x 10 ^ (where Qk is the present component in curvature and ^(0) is the initial magnitude 



of the Newtonian potential caused by the collision). More generally, Eq. 13 bounds the nucleation rate of bubbles in 
our parent vacuum, provided gravitational waves and negative curvature are observed with future experiments. 

Although we have obtained a null result, our analysis pipeline has identified four features in the WMAP 7- year data 
that have Bayesian evidence ratios that are significantly larger than expected for false detections from an end-to-end 
simulation of the WMAP experiment. Two of these features (features 2 and 3) have been noted in previous literature. 
Feature 2 corresponds to the WMAP Cold Spot [65] (see Ref. [66] for a review of its properties), and feature 3 was 
identified using standard needlets in Ref. [48]. All four features are far from the Galactic cut of the KQ75 7- year 



mask (see Fig. [2l|, and none appear to be responses to the point source components of the mask (see Fig. [22|. We 
have confirmed that the signal in each case is not strongly dependent on the frequency band used (see Fig. [23] ), 
providing evidence that these features are not due to astrophysical foregrounds. A number of analyses, most recently 
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the redshift analysis of Ref. [67], suggest that the Cold Spot is primordial and not associated with the integrated 
Sachs- Wolfe effect of a large void. Further studies of the other three features would be needed to confirm that they 
are truly primordial. 

Our ability to detect bubble collisions will improve greatly with data from the Planck satellite. Decreased instru- 
mental noise will enlarge the exclusion and sensitivity regions in parameter space for the needlet step of the analysis, 
as evidenced by our ability to detect more simulated collisions in low-noise regions of the WMAP data. The threefold 
increase in resolution will greatly improve our ability to detect circular edges. In addition, the polarization data from 
Planck will be of sufficient resolution to look for complementary signatures of bubble collisions [26, 68 . Such an 
analysis should be able to confirm if the features we have identified are in fact bubble collisions. 

It is also important to determine if other theories predicting azimuthally symmetric features in the CMB [60 11691171] 
are better fits to the data. The blob and edge detection steps in our analysis pipeline are sensitive to a variety of 
possible signatures, and given a model, the Bayesian model comparison step could be easily tailored to accommodate 
different forms of the temperature modulation. Because our pipeline is automated, we can compare the evidence 
ratios obtained for different models to decide which is a better fit, without recourse to a posteriori choices of which 
features to analyze. 

In conclusion, we have presented a powerful algorithm for analyzing CMB data for signatures of bubble collisions. 
Applying this pipeline to the WMAP 7-year data, we have constrained the possible parameter space of bubble 
collisions, as well as identifying interesting candidate signatures in the data for further investigation. Future data 
from the Planck experiment will allow us to greatly improve on these results. If confirmed, the presence of bubble 
collisions in the CMB would be an extraordinary insight into the origins of our universe. 



ACKNOWLEDGMENTS 



We are very grateful to Eiichiro Komatsu and the WMAP Science Team for supplying the end-to-end WMAP 
simulations used in our null tests. SMF is supported by the Perren Fund. MCJ acknowledges support from the 
Moore Foundation. HVP is supported by Marie Curie grant MIRG-CT-2007-203314 from the European Commission, 
and by STFC and the Lever hulme Trust. MCJ and HVP thank the Aspen Center for Physics, where this project 
was initiated, for hospitality. HVP and DJM acknowledge the hospitality of the Statistical Frontiers of Astrophysics 
workshop at IPMU, Tokyo. SMF thanks Filipe Abdalla, Ingo Waldmann and Michael Hirsch for useful discussions, 
and Jenny Feeney for proofreading the manuscript. MCJ thanks Rebecca Danos for discussions regarding the edge 
detection algorithm. HVP thanks Andrew Pontzen for interesting discussions regarding the general problem of CMB 
anomaly hunting. We acknowledge use of the HEALPix package and the Legacy Archive for Microwave Background 
Data Analysis (LAMBDA). Support for LAMBDA is provided by the NASA Office of Space Science. Research at 
Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of 
Ontario through the Ministry of Research and Innovation. A preprint version of this paper presented only evidence 
ratios confined to patches. We thank an anonymous referee who encouraged us to develop this algorithm into a 
full-sky formalism. This calculation is now presented in Appendix A and incorporated into our analysis pipeline and 
results. 



C. L. Bennett et al (WMAP), Astrophys. J. 583, 1 (2003)| |arXiv:astro-ph/0 301158| 

J. A. Tauber, N. Mandolesi, J. Puget, et al, AkA 520 (2010), 10.1051/0004-6361 /200912983 
L. Susskind, in Universe or Multiverse (Cambridge University Press, 2003). 
A. Aguirre, in Beyond the Big Bang (Springer, 2008). 

M. Bucher, A. S. Goldhaber, and N. Turok, Phys. Rev. D52, 3314 (1995), h ep-ph/9411206| 

J. R. Gott, Nature 295, 304 (1982). 

J. Garcia-Bellido, (1997), arXiv:hep-ph/9803270j 

S. R. Coleman, Phys. Rev. D15, 2929 (1977) 

S. R. Coleman and F. De Luccia, Phys. Rev. D21, 3305 (1980)1 

A. H. Guth and E. J. Weinberg, Phys. Rev. D23, 876 (1981). 

A. H. Guth and E. J. Weinberg, Nucl. Phys. B212, 321 (1983)J 

J. R. Gott and T. S. Statler, Phys. Lett. B136, 157 (1984).^ 

J. Garriga, A. H. Guth, and A. Vilenkin, Phys. Rev. D76, 123 512 (200 7)| [ar" Xiv:hep-th/0612242| 

A. Aguirre, M. C. Johnson, and A. Shomer, Phys. Rev. D76, 063509 (2007),^x X'iv:0704.3473 [hep-thj 

S. W. Hawking, L G. Moss, and J. M. Stewart, Phys. Rev. D26, 2681 (1982) 

Z.-C. Wu, Phys. Rev. D28, 1898 (1983). ^ 

A. Aguirre and M. C. Johnson, Phys. Rev. D77, 123536 (2008)| [arXiv:0712.3038 [hep-th]| 



37 



A. Aguirre, M. C. Johnson, and M. Tysanner, Phys. Rev. D79, 123 514 (2009)] 
S. Chang, M. Kleban, and T. S. Levi, JCAP 0804, 034 (2008) 

S. Chang, M. Kleban, and T. S. Levi, [ JCAF 0904, 025 (2009), arX iv:0810.5128 [hep-th]| 

A. Dahlen, Phys.Rev. D81, 063501 (2010), arXiv:0812.0414 [hep-th] 

B. Freivogel, M. Kleban, A. Nicolis, and K. Sigurdson, JCAP 0908, 036 (2 009) | 

R. Easther, J. T. Giblin, Jr, L. Hui, and E. A. Lim, Phys. Rev. D80, 123519 (2 009)] |arXiv:0907.3234 [hep-th]| 
K. Larjo and T. S. Levi, JCAP 1008, 034 (2010), arXiv:0910.4159 [hep-th] 

J. Zhang and Y.-S. Piao, P hys.Rev. D82, 04 3507 (2010), arXiv: 1004.2333 [hep-th] 

B. Czech, M. Kleban, K. Larjo, T. S. Levi, and K. Sigurdson, (2010), arXiv:1006.0832 [astro-ph.CO]] 
A. Aguirre and M. C. John son, (20 09), arXiv:0908.4105 [hep-th] 

N. Kaiser and A. Stebbins, Nature 310, 391 (1984) 

A. S. Lo and E. L. Wright, (2005), arXiv:astro-ph/0503120l 

R. J. Danos and R. H. Brandenberger, Int. J. Mod. Phys. D19, 183 (2010), arXiv:081 1.2004 [astro-ph]' 
S. Amsel, J. Berger, and R. H. Brandenberger, JCAP 0804, 015 (2008) , arXiv:0709.0982 [astro-ph], 
N. Jarosik et a/., (2010), arXiv: 1001. 4744 [astro-ph.CO] 

S. M. Feeney, M. C. Johnson D. J. Mortlock, an d H. V. Pe iris, (2010), [aTxTv: 1012. 1995 [astro-ph.CO]] 

C. L. Bennett et al, (2010), |arXiv:1001.4758 [astro-ph.CO]] 

Y. Sekino, S. Shenker, and L. Susskind, Phys. Rev. D81, 123515 (2010) | [alXiv: 1003. 1347 [hep-th]| 
E. Komatsu et a/., (2010), arXiv: 1001. 4538 [astro-ph.CO] 

B. Freivogel, M. Kleban, M. Rodriguez Martinez, and L. Susskind, JHEP 03, 039 (2006). 
A. De Simone and M. P. Salem, Phys.Rev. D81, 083527 (2010), arXiv:0912.3783 [hep-th] ^ 

C. Gordon, W. Hu, D. Huterer, and T. Crawford, Phys. Rev. D 72, 103002 (2005) 

J. T. Giblin, Jr, L. Hui, E. A. Lim, and L-S. Yang, Phys. Rev. D82, 045019 (2010), arXiv : 1005.3493 [hep-th]] 
M. C. Johnson and L-S. Yang, Phys. Rev. D82, 065023 (2010), arXiv:1005.3506 [hep-th] 
M. S. Turner, Phys. Rev. D44, 3737 (1991) 

A. L. Erickcek, S. M. Carroh, and M. Kamionkowski, Phys. Rev. D78, 08301 2 (2008)1 |arXiv:0808. 1570 [astro-ph]| 



J. P. Zibin and D. Scott, |Phys. Rev. D78, 123529 (2008), arXiv:0808.2047 [a stro-ph|[ 

J. Garcia-Bellido, A. R. Liddle, D. H. Lyth, and D. Wands, , Phys. Rev. D52, 6750 (1995)] |arXiv:astro-ph/9508003l 

D. Marinucci et al, MNRAS 383, 539 (2008). 

D. Pietrobon, A. Balbi, and D. Marinucci, Phys. Rev. D74, 043524 (2006), arXiv:ast ro-ph/0606475| 
D. Pietrobon et a/., Phys. Rev. D78, 103504 (2008), arXiv:0809.0010 [astro-ph] 

P. Baldi, G. Kerkyacharian, D. Marinucci, and D. Picard, ArXiv Mathematics e-prints (2006), [arXiv:math/0606599| 

F. Guilloux, G. Fay, and J^-F. Cardoso, (2007), arXiv:070 6.2598 [math.CA] 

S. Scodell er et al, (2010), [arX iv: 1004. 5576 [astro-ph.CO] 

J. Canny, IEEE Trans. Pattern Anal. Mach. Intell. 8, 679 (19867 

C. Kimme, D. Ballard, and J. Sklansky, Commun. ACM 18, 120 (1975)] 
F. Feroz, M. P. Hobson, and M. Bridges, MNRAS 398, 1601 (2009). 

D. Larson et a/., (2010), arXiv: 1001.4635 [astro-ph.CO] 

K. M. Gorski et a/., Astrophys. J. 622, 759 (2005), arXiv:astro-ph/0409513 

A. de Oliveira-Costa and M. Tegmark, Phys. Rev. D74, 023005 (2006) , arXiv:ast ro-ph/0603369| 



N. E^ Groeneboom, L. Ackerman, L K. Wehus, and H. K. Eriksen, Astroph ys. J. 722, 452 (2010)[ |arXiv:0911.0150 

[astro-ph. COjl. 

J. Hoftuft et g/., I Astr ophys. J. 699, 985 (2009) | [aFXiv:0903. 1229 [astro-ph.CO ]! 



M. Cruz, E. Martmez-Gonzalez, P. Vielva, J. M. Diego, M. Hobson, and N. Turok, [MNRAS 390, 913 (2008) 

arXiv:0804.2904 

A. Pontzen and H. V. Peiris, [Phys. Rev. D81, 103008 (2010)] [arXiv: 1004.2706 [astro-ph.CO] 



R. T. Cox, Am. J. Th. Phys 14, 1 (1946) 
J. Skilling, in AIP Conference Proceedings of the 24th International Workshop on Bayesian Inference and Maximum 
Entropy Methods in Science and Engineering, Vol. 735 (2004) pp. 395-405. 
A. Lewis and S. Bridle, Phys. Rev. D66, 103511 (2002), arXiv:astro-ph/0205436^ 



M. Cruz, E. Martinez-Gonzalez, P. Vielva, and L. Cayon, |Mon. Not. Roy. Astron. Soc. 356, 29 (2005)| |arXiv:astra 
ph/0405341 



M. Cruz, E. Martinez-Gonzalez, and P. Vielva, in Highlights of Spanish Astrophysics V (2010) arXiv:0901.1986 [astro-ph] 
M. N. Bremer, J. Silk, L. J. M. Davies, and M. D. Lehnert, ArXiv e-prints (2010), arXiv: 1004.1 178J 

C. Dvorkin, H. V. Peiris, and W. Hu, Phys. Rev. D77, 063008 (2008), arXiv:0711.2321 [astro-ph] 

N. J. Cornish, D. N. Spergel, and G. D. Starkman, Class. Quant. Grav. 15, 2657 (1998), arXiv:ast ro-ph/9801212 
N. Afshordi, A. Slosar, and Y. Wang, (2010), arXiv:1006.5021 [astro-ph.CO] 



E. D. Kovetz, A. Ben-David, and N. Itzhaki, Astrophys.J. 724, 374 (2010 JQarXiv: 1005.3923 [astro-ph.CO] 

i\/r JD u^u^^^ i\/r^T^^ui^^ i\/n\TTZ)AC ooo /'nnno\ — v:,..^^-*- — /non/i /i k^i 



M. P. Hobson and C. McLachlan, MNRAS 338, 765 (2003), arXiv:astro-ph/0204457' 
M. P. Hobson, G. Rocha, and R. S. Savage, "Bayesian source extraction," in Bayesian Methods in Cosmology, edited by 
Hobson, M. P., Jaffe, A. H., Liddle, A. R., Mukeherjee, P., k Parkinson, D. (2010) pp. 167-+. 



38 



Appendix A: Statistical formalism 
1. Posterior 

In this appendix, we discuss how Bayesian parameter estimation and model selection for theories which predict 
localized sources can be approximated by a patch- wise analysis. Consider astronomical observations covering solid 
angle l^obs = 47r/sky that are of sufficient depth/resolution to identify sources with a particular range of properties 
(which can then be deemed "detectable"). Given a theory that predicts an expectation value of sources over the 
whole sky, we want to know both: what constraints the available data place on N^] and whether the data favour a 
model which predicts one value of iVg over another, all the relevant information is encoded in the posterior distribution 
Pr(7Vs|d, /sky), where d are the pixelized flux or temperature measurements (and, optionally, any statistics derived 
from them). Bayes' theorem allows the posterior to be written as 

where Pr(7Vs) is the prior distribution on TVg, Pr(d|7Vs, /sky) is the likelihood of getting the observed data given the 
area of observation and the expected number of sources, and Pr(d|/sky) ensures that the posterior is normalized 
over A/g. Constraints on TVg can be drawn directly from this normalized posterior; the relative probability of models 
predicting different values of TVg can be found by picking out the posterior at two values of N^: 

Pr(7Vg,i|d,/gky) ^ Pr(7V,,i)Pr(d|7V,,i,/gky) 

Pr(iVs,2|d, /sky) Pr(iVs,2)Pr(d|iVs,2, /sky) ' ^ ^ 

In the absence of a prescriptive theory, it is useful to emphasize the role of the data, which can be done by adopting 
a flat prior on N^] further assuming that the data will give an upper limit on TVs, it is possible to adopt an improper 
uniform prior Pr(iVs) = 0(iVs) without any high-TVs cut-off. The resultant posterior has the form 

Pr(iV,|d, Uy) oc e{Ns) Pr(d|iV„ Uy), (A3) 

up to a normalization constant that depends on the data and /sky but not on TVs- 

In general TVs is not directly measurable, even for perfect data, because the number of sources present in the 
observable sky, A^s, is the realization of a Poisson-like process (of mean /sky^s)- The possibility that A^s is itself 
subject to some uncertainty (e.g., due to noisy data or confusion problems) can be incorporated by marginalizing over 
to give 

oo 

Pr(d|iV„/,ky)= Pr(A^sl^s,/sky)Pr(d|7V„/,ky) 

Ns=0 



= E 



Pr(d|TVs,/sky), (A4) 



where the second formula explicitly assumes that the number of observable sources is drawn from a Poisson process. 



Inserting this second expression into Eq. A3 then gives 

Pr(TVs|d, /sky) (X 6(TVs) e"^-^^^ £ ifs^yNs^^ Pr(d|Ar„ /,ky). (A5) 

The form of the likelihood Pr(d|A/'s, /sky) is treated largely in abstract here, with the specific details of the likeli- 
hood calculation for the bubble collision hypothesis given WMAP 7- year data described in Sec. |V C Assuming the 



measurements take the form of flux/counts at different positions on the sky (as for a CMB experiment) and that they 
are subject to (possibly correlated) Gaussian noise, the likelihood would have the form 

Pr(d|Ar„/,ky) = J dmi...dm^3 (A6) 

where t(m) is the data template that would result from a source whose position and profile/scale are defined by the 
model parameters m, Pr(m5) is the prior distribution of source parameters for "detectable" sources, and C is the 
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pixel-pixel covariance matrix of the non-source noise (which could include contributions that are considered signal in 
other contexts, such as the CMB). 

Evaluating the full sum in Eg. |A5| is not always practical or even possible. In addition, the evaluation of individual 
terms in this sum will be computationally limited by the size of the covariance matrix C and the cost of performing 
the integral over model parameters for each template. However, it is possible to circumvent these problems, and 
estimate the posterior Eq. A5 if one knows in advance some of the properties of the integrand in Eq. A6 

To see how this works, assume that one has located a set of A/'b "blobs" on the sky that are candidate sources. 
Segment the sky into = + 1 regions consisting of those containing blobs, and the rest of the sky. Given A/'b, we 



can now evaluate Eq. A5 term- by-term. The likelihood in the first term, for A^g = 0, is simply given by: 



Pr(d|0,/sky) 



(27r)^px/2|C| 



-dC"'d^/2 



which is the likelihood for the null- hypothesis with no sources. Moving on to the 
integral over source positions to cover each of the A^r regions: 



(A7) 



1 term, we first expand the 



Pr(d|l,/sky) 



E 

r=l 



dmPr(m) 



1 



region r 



(27r)^px/2|C| 



-[d-t(m)]C ^[d-t(m)]'^/2 



(A8) 



We now assume that the blobs containing candidate sources include all of the significant contributions to the integral, 
and replace A'r in the sum by A^b- This will give us a lower bound on the likelihood, even if a number of actual sources 
are not contained within the blobs defined by the candidate sources. We further assume that sources do not overlap. 
If the covariance matrix is small enough to invert, we could stop here. However, in cases where the covariance matrix 
is too large to feasibly invert (as is the case for the WMAP 7 year data), we can make one further approximation: 



1 



-[d-t(m)]C '[d-t(m)]'^/2 



/ dmPr(m) JjL^(m), 

J region b , i 



(A9) 



where the product is over A'r disjoint regions on the sky and 



Lr{m) 



1 



(2^)^px/2|C, 



-[d,-t,(m)]C^ '[d,-t,(m)]'^/2 



(AlO) 



is the contribution to the likelihood of data in region r, defined in terms of the covariance of the pixels in this region, 
C^, the data in this region, d^, and the source template in this region, t^(m). This is exact for a diagonal covariance, 
but is only approximate in the case where there are off-diagonal elements. Using the assumption that the integral has 
a significant contribution only inside the blobs, in the rest of the sky we can replace t = 0. The one-source model 
likelihood then becomes 



Pr(d|l,/,ky)^ Vn^-(%^' 



6=1 r=l 



where 



Pb 



/region 6 



This is the evidence ratio for a single source template centered in region b. 
For a general number of blobs and sources, the model likelihood is 



(All) 
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Pr(d|7V„/,Uy) 



Z_-/6i ,62,.--)' 



if Ns > ATb, 



(A13) 



where the combinatorics require some explanation. If there are fewer blobs on the sky than proposed sources then 
the likelihood is very small: by assumption, the likelihood evaluated outside of a blob is small. If there are at least as 
many blobs as proposed sources, then the likelihood takes the form of a sum that includes every possible association 
of the A^s sources with the A/'b blobs, provided that no two sources are matched to the same blob. Hence the multiple 
sum generates all possible combinations of source-blob associations and the product over evidence ratios gives the 
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relevant weightings; the product over delta functions removes the terms in which any two sources are attached to the 
same blob. 



Inserting the likelihood given in Eq. |A13|into Eg. |A5| yields the unnormalized posterior on N^: 



Pr(iVs|d,/sky)ae(iV,)e- 



/sky 



AT, ^ (/sky^s)^' 



E 



s=l i,j = ^ 



under the assumption that 



Pr(d|0,/,ky) = n^'-(O)' 



(A14) 



(A15) 



in which case regions that do not contain a blob are irrelevant for determining the posterior. Eq. A14 is the main 
result of this calculation, from which all following results can be derived. In the limit of a single isolated observation 
Eq. A14 reproduces the Bayesian source detection formalism developed in [72l [73]. 



2. Special cases 

a. Perfect data 



With infinite, perfect data the number of sources on the sky would be di rectly determined by counting the TVb 
"blobs" in the data and so Pr(d|A/'s, /sky) = ^Ns,n-^- The posterior in Eq. A14 would become 



(A16) 



the standard result for constraining a rate variable from a single measurement, modified slightly to account for the 
fact that the constraint on iVg is weakened if /sky <C 1. In the even more particular case that no blobs were detected 
in perfect data, the posterior would be 



Pr(iVs|0,/sky) = e(iVs) /skye-^^^-^^ 
If a single blob was detected unequivocally then the posterior would be 

Pr(7V,|l,/,ky) = e(7Vs)/sky(/skyiVs)e-^^^-^^ 
If two blobs were detected unequivocally then the posterior would be 



(A17) 
(A18) 

(A19) 



b. No blobs 



If there are no identified blobs then A^t = and there is no evidence for any sources at all. This is really a weaker 
constraint than the above situation if the data are perfect, but in the approximation used here the final result is the 
same. In Eq. A14 the first sum is truncated at the first term and so 



Pr(iV,|0,d,/,ky) = e(iV,)/,uye 



-/skyA/'s 



(A20) 



matching Eq. A17 



Adopting a fiat prior on A^s, the posterior probability ratio for a model predicting a generic A^s > versus one 
predicting no collisions in this case is given by 



Pr(iV,|0,d,/,ky) 
Pr(0|0,d,/,ky) 



(A21) 



This is always less than one, and so as expected, a theory which predicts sources on the sky is always disfavoured 
when compared to a theory that predicts no sources on the sky. 
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c. One blob 

Probably the most important simple case is where there is a single identified blob, which might represent a first 



detection of this class of source. Inserting N\y = 1 into Eq. |A14[ the sum includes the possibilities of either one source 
on the (observed) sky or no sources; the posterior evaluates to 



Pr(iV,|l,d,/,ky) = e(iV,)/,kye- 



(A22) 



l + Pb 

In the limit that the data in this region are much better fit by a source then p5 ^ 1, and the posterior becomes 

Pr(iV,|l, d, /,ky) = e(iV,) /.ky(/skyiV,)e-/^^-^% (A23) 



which matches A18 above. Conversely, in the limit that the source is a worse fit to the data (possible given that the 
source has been forced to be detectable), then p5 <C 1 and 



Pr(7Vs|l,d,/sky) = 6(7Vs)/skye- 



(A24) 



matching Eq. A17 which was obtained under the assumption that there was no blob in the first place. 

Adopting again a flat prior on TVg, the posterior probability ratio for a model predicting a generic > versus 
the no-bubble case is given by 



Pr(iVs|l,d,/sky) 

Pr(0|l,d,/,ky) 



-UyN. (1 + /skyiV, 



Pb 



(A25) 



Here, it can be seen that two things are necessary to favour the theory with sources given one detection: ~ 
and pi^:^ 1. 

d. Two blobs 

If two blobs are identified then the sum in|A13|has three terms, for which the likelihoods are: 



Pr(2,d|0,/sky) = n^-(0). 



(A26) 



r=l 



Nr 



Pr(2,d|l,/sky) = [Pb^^Pb,]l[Lr{0) 



r=l 



and 



Pr(2,d|2,/sky) = P6,P62n^-W- 



(A27) 
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Hence the (unnormalized) posterior is 

Pr(iV,|2,d, /,ky) cx eiNs)e-f^^y^' {l + /,kyiV, [pb, + Pb,] + { f sky Nsfpb.Pb,} ■ 



(A29) 



In the limit that the evidence for both sources is strong (i.e., pfc^ 3> 1 and pb^ ^ 1) then the third term in the curly 
braces dominates and 



Pr(iV,|2,d,/,ky) = e(iV,):^(/,kyiV,)2e-/-.^=, 



(A30) 



which matches the perfect data case with N\y = 2, as expected. In the limit where one blob is a false candidate, but 
the other yields a strong evidence (e.g., ^ 1 and ^ 1), then we recover the perfect data case with Nt^ = 1. 



